Preface


Cosmology is the unifying discipline par excellence, combining theories of gravity, ther- 
modynamics, and quantum field theory with theories of structure formation, nuclear 
physics, and condensed matter physics. Its observational tools include the most intricate 
and expensive scientific experiments ever devised, from large-scale interferometry, high- 
energy particle accelerators, and deep-sea neutrino detectors, to space-based observatories 
such as Hubble, the Wilkinson Microwave Anisotropy Probe (WMAP), and Planck. The recent
and stunning detection of massive gravitational encounters between black holes by means 
of gravitational waves is only one of several windows that have recently been opened into 
the study of distant and exotic objects, and to ever-earlier epochs of the universe. 

Cosmology is in a golden age of discovery, the likes of which have rarely been seen in 
the physical sciences. Theory has hardly kept up, but its bringing together of the fundamen- 
tal theories of physics is also historic in its vitality. It draws them together in ways that put 
pressure on each: whether because, as in quantum mechanics, cosmology is an application 
in which there is no ‘external observer’; or because, as with the standard model of particle 
physics and general relativity, there is tension between their basic principles; or because, 
as in statistical mechanics, it highlights the extraordinary importance of the initial condi- 
tions of the universe for local physics. Add to these components the existing foundational 
problems of each discipline even in non-cosmological settings: the measurement problem 
in quantum mechanics, the ‘naturalness’ problem of the Higgs mass and the cosmologi- 
cal constant or ‘dark energy’ (so-called ‘fine-tuning’ problems), and the information-loss 
paradox of black-hole physics. The result is a heady brew — and this is not even to mention 
the enigma that is dark matter, making up the bulk of the gravitating matter of the universe, 
its nature still unknown. 

What place, in this perfect storm, for philosophy? Some see none: ‘philosophy is dead’,
according to Stephen Hawking, and needs no engagement from scientists. And indeed, 
where philosophers of physics have made inroads on conceptual questions in physics, they 
have tended to focus on cleanly defined theories treated in isolation. Synthetic theories, 
in complex applications, are messy and ill suited to rigorous analysis, axiomatisation, or 
regimentation by other means — the standard tools of philosophy. And yet, time and again 
scientists ignorant of philosophy go on to do it anyway, badly. Philosophical questions are 
natural to the growing child and beset anyone with an enquiring mind; they are suppressed 
at best by ring-fencing, at worst by decree. And cosmology has long been a testing ground
of philosophy. According to some, its central domain is ‘the problem of cosmology’; that 
is the problem of understanding the world, including ourselves, and our knowledge, as part 
of the world, to echo Karl Popper. 

For all of that, ‘philosophy of cosmology’ as a body of philosophical literature engaged 
with contemporary cosmology does not yet exist. This book marks a beginning. In it we 
have gathered essays edging out from cosmological problems to philosophy, and from 
philosophical problems to cosmology; it is at the points at which they meet that we hope 
for the greatest synergies. Both disciplines are at a turning point: cosmology, in virtue 
of the greatly accelerated rate of data acquisition that brings with it a new set of funda- 
mental problems with few or no precursors in any of the empirical sciences; philosophy 
of physics, in virtue of significant progress in the last two decades on the foundations of 
quantum mechanics, in particular an understanding of quantum theory as a realistic the- 
ory applicable to the universe as a whole. Add to this new perspectives on the meaning of 
physical probability, the role of probability in the foundations of statistical mechanics, and 
the significance of the initial state of the universe for local physics and the arrow of time. 

In their contributions to the book, philosophers explore still wider concerns in epistemol- 
ogy, metaphysics, and philosophy of mathematics. We have urged them, just as contributors 
working in cosmology proper, to push in the least comfortable directions and to expose 
rather than conceal what is conceptually obscure in their undertakings. As editors, we are 
united in the view that the hard problems of cosmology should be thrown into as sharp a 
relief as possible, and in the simplest terms possible, if there is to be any hope of overcom- 
ing them. It is to that end that this book, and the research programmes from which it was 
created, were conceived. 

We thank the John Templeton Foundation for a grant that enabled the preparation of this 
volume by supporting a series of workshops and a conference. 
Part I
Issues in the Philosophy of Cosmology 


1 


The Domain of Cosmology and the Testing of 
Cosmological Theories 


GEORGE F. R. ELLIS 


This chapter is about foundational themes underlying the scientific study of cosmology: 


• What issues will a theory of cosmology deal with?

• What kinds of causation will be taken into account as we consider the relation between
chance and necessity in cosmology?

• What kinds of data and arguments will we use to test theories, when they stretch beyond
the bounds of observational probing?

• Should we weaken the need for testing and move to a post-empirical phase, as some have
suggested?


These are philosophical issues at the foundation of the cosmological enterprise. The answers
may be obvious or taken for granted by scientists in many cases, and so seem hardly
worth mentioning; but that has demonstrably led to some questionable statements about 
what is reliably known about cosmology, particularly in popular books and public state- 
ments. The premise of this chapter is that it is better to carefully think these issues through 
and make them explicit, rather than having unexamined assumed views about them shap- 
ing cosmological theories and their public presentation. Thus, as in other subjects, being 
philosophical about what is being undertaken will help clarify practice in the area. 

The basic enterprise of cosmology is to use tested physical theories to understand major 
aspects of the universe in which we live, as observed by telescopes of all kinds. The foun- 
dational issue arising is the uniqueness of the universe [66, 27, 28]. Standard methods of 
scientific theory testing rely on comparing similar objects to determine regularities, so they 
cannot easily be applied in the cosmological context, where there is no other similar object 
to use in any comparison. We have to extrapolate from aspects of the universe to hypothe- 
ses about the seen and unseen universe as a whole. Furthermore, physical explanations of 
developments in the very early universe depend on extrapolating physical theories beyond 
the bounds where they can be tested in a laboratory or particle collider. Hence philosophi- 
cal issues of necessity arise as we push the theory beyond its testable limits. Making them 
explicit clarifies what is being done and illuminates issues that need careful attention. 

The nature of any proposed cosmological theory is characterized by a set of features 
which each raise philosophical issues: 
1. Scope and goals of the theory

2. Nature of the theory: kinds of causality/explanation envisioned

3. Priors of the theory: the range of alternatives set at the outset

4. Outcomes of the theory: what is claimed to be established or explained

5. Data for the theory: what is measured

6. Limits to testing of outcomes

7. Theory, data, and the limits of science


The following sections will look at each in turn. 


1.1 Scope and Goals of the Theory 


I distinguish here between the physical subject of cosmology, and the wider concept of 
cosmologia, which also entertains philosophical issues. 


1.1.1 Cosmology 


I define “cosmology” as theory dealing with physical cosmology and related mathemat- 
ical and physical issues. Specifically, it proposes and tests mathematical theories for the 
physical universe on a large scale, and for structure formation in that context. It is a purely 
scientific exercise, relating to the hot big bang theory of the background model of cos- 
mology, probably preceded by an earlier phase of exponential expansion, to theories of 
structure formation in this context, and to the many observational tests of these propos- 
als (see e.g. Silk [88], Dodelson [24], Mukhanov [69], Durrer [25], Ellis, Maartens and 
MacCallum [31], Peter and Uzan [81]). It is a very successful application of general rela- 
tivity theory to the large-scale structure of spacetime in a suitable geometrical and physical 
context. However, because of the exceptional nature of cosmology as a science, it pushes 
science to the limits, particularly when considering what can be said scientifically about 
the origins of the universe, or existence of a multiverse. 


1.1.2 Cosmologia 


By contrast, “cosmologia” considers all this, but in addition deals in one way or another
with the major themes of the origin of life and the nature of existence that are raised because
the physical universe (characterised by “cosmology”) is the context for our physical being.
It does so in a way compatible with what we know about physical cosmology and physics. 
Cosmologia necessarily considers major themes in philosophy and metaphysics, perhaps 
relating them to issues of meaning and purpose in our lives, and is of great public interest 
and concern. 


1.1.3 Which are we Studying?


The major point to make here is that theorists tackling cosmological issues need to make 
clear which topic they are dealing with: cosmology, or cosmologia. Either is a legitimate 
topic for investigation, and one can choose which to tackle, but it should be very clear
which is the topic of discussion. What is not legitimate is to use only the methods of 
cosmology, and claim to solve problems of cosmologia. If one wants to tackle cosmologia, 
one must use adequate methods to do so, involving an adequate scope of enquiry, methods, 
and data, as discussed further below. 

Now one may think the latter (“cosmologia”) is not what academic cosmologists should
be dealing with; and indeed most groups dedicated to cosmology in physics or astronomy 
departments restrict themselves to cosmology as defined here. However, some scien- 
tists studying cosmology in such departments are proclaiming confidently about issues 
of cosmologia in highly publicized books and lectures (e.g. Susskind [92], Hawking and 
Mlodinow [52], Krauss [57]). Thus in these cases the drive to consider these broader issues 
comes from cosmologists themselves. 

The theme of this section can be stated as: 


Issue 1: We can legitimately consider either the enterprise of cosmology as defined here, or extend 
our investigations to the further issues referred to in cosmologia; but we must make a clear decision 
as to which stance we are taking, and then adhere to this clearly and consistently in our work. The 
scope of the theory proposed should be made very clear at the outset. 


In this way, workers in the field can make quite clear if they are engaged in a purely 
scientific enterprise of cosmology, or are also entering the philosophical and metaphysical 
waters embodied in cosmologia, commenting on issues of meaning and purpose as well as 
discussing issues in physical cosmology as evidenced by astronomical observations. 


1.2 Kinds of Causality/Explanation Envisioned 


Given a statement of scope of the theory, the next issue is what kind of causality will 
be taken into account. What kinds of causal influences are assumed to be possible in the 
universe? How do they relate to the great issues of chance, necessity, and purpose? 

Assuming we are dealing with cosmology rather than cosmologia, the basic underlying 
assumption is that what happens at the cosmic scale is the outcome of the interplay of 
chance and necessity alone. The fundamental problem is that in the cosmological context 
of a unique universe, the difference between them is blurred. 


1.2.1 Necessity 


The inexorable nature of necessity is taken to be embodied in fixed and immutable physical 
laws, expressed in mathematical form [76]. They are taken to be eternal and unchanging, 
being the same everywhere in the universe at all times and places. The nature of the causal 
laws proposed (variational principles, symmetry principles, equations of state, and so on) 
characterizes the causal factors in action in the physical universe that underlie necessity. 
Sometimes it is suggested that the laws of physics change with time or place, for example 
through variation of constants such as the gravitational constant G [96], or choice of a 
string theory vacuum; but if so, a higher set of laws will be proposed that determine how
this happens, for example laws determining how G changes with time, or the laws of string 
theory that show how string vacua affect local physics. If this is not done, we have no 
ability to predict physical outcomes scientifically. If this is done, then it is this higher-level 
set of laws that are the fundamental unchanging ones that govern what happens: in essence 
one’s first stab at finding the unchanging laws was wrong, but that does not mean they do 
not exist: the higher level set are of this nature. 


The Nature of the Laws of Physics 


The laws of physics are not physical entities; they are abstract relations characterizing the 
behavior of matter or fields. It is not clear if the laws are prescriptive, controlling what 
happens, or descriptive, describing what happens. If they are prescriptive, where or in what 
way do they exist? How do they get their causal power? If they are descriptive, some- 
thing else controls how matter behaves: what is it, and how does it get its causal power? 
And then, why does matter everywhere behave in the same way, so that it is described by 
mathematical laws? 


Possibility Spaces 


It is not clear how to obtain traction in considering these issues. A proposal that to some 
degree sidesteps them is the idea that possibility spaces are the best way to describe the 
effects of physical laws. Deterministic laws $\mathcal{L}$ are associated with possibility spaces $\Omega_P$ that
delimit their possible outcomes, and express the nature of necessity by characterizing what 
is and what is not possible. In effect, we characterize causality not by the nature of the laws 
themselves, but by the nature of their solution spaces, which strictly constrain what is pos- 
sible in the physical world. These include phase spaces in classical theory, Hilbert spaces 
in quantum theory, the landscape of string theory, and so on. Possibility spaces are hier- 
archically structured, with multiple layers of description and effective behavior depending 
on the level of averaging used. Constraints such as conservation laws characterize allow- 
able trajectories in these spaces, and so largely define the dynamics (e.g. in Hamiltonian 
systems). 


1.2.2 Ontology and Epistemology 


A distinction is crucial here. Possibility spaces themselves exist as unchanging abstract 
(Platonic) spaces $\Omega_P$ limiting all possible structures and motions of physical systems. They
are the same at all places and times. Our knowledge of them however is a representation of
that space that is changing with time. That is, we represent $\Omega_P$ by some projection:


$$P(t) : \Omega_P \to E_P \qquad (1.1)$$


into a representation space $E_P$ where $P(t)$ depends on the representation we use, and
changes with time. This does not mean that physics itself, or the possibilities it allows, 
are changing: it is just that our knowledge of it is changing with time. Ontology (what 
possibilities exist, as a matter of fact) is entailed by the nature of $\Omega_P$. Epistemology (what
we know about it) is determined by the projection $P(t)$. The representation space $E_P$ will
be represented via some coordinate system and set of units, which can be altered without 
changing the nature of the entities being represented. Such coordinate changes therefore 
represent symmetries in the space of possible representations. 


The Nature of the Laws 


The fundamental issue for cosmology is two-fold. Firstly, 


• What kinds of causal laws $\mathcal{L}$ and associated possibility spaces $\Omega_P$ hold in the universe?
What are their properties? 


That is a scientific issue, determined by experiment and observation. According to our 
current understanding, the laws of physics are locally describable in terms of suitable 
mathematical equations. They will involve: 


• a description of the variables involved and their interactions, governed by specific
charges and masses,

• associated variational principles, leading to dynamic equations,

• symmetry principles and associated conservation laws and constraints,

• a specification of the geometry and number of dimensions of the space in which this all
happens.


This leads to appropriate partial differential equations, such as Maxwell’s equations, Yang–Mills
equations, the Einstein field equations, the Dirac equation, and so on, that can be
used to calculate the outcome of the application of the laws, given suitable initial data.
There will be alternative ways of expressing these laws; for example, it may be possible
to express them in Hamiltonian or Lagrangian form, or as path integrals, or as partial 
differential equations, or as integral equations. Given suitable constraints on the physical 
situation considered, the partial differential equations may reduce to families of ordinary 
differential equations for suitable variables, which can be expressed in dynamical systems 
terms. 
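
As a standard illustration of this reduction (a textbook example, not an equation given in this chapter): assuming a spatially homogeneous and isotropic (Friedmann-Lemaître-Robertson-Walker) geometry with scale factor $a(t)$ and spatial curvature constant $k$, a perfect fluid with density $\rho$ and pressure $p$, and a cosmological constant $\Lambda$, the Einstein field equations reduce to ordinary differential equations in time. All symbols here are introduced for this sketch only.

```latex
% Friedmann equation (first integral of the dynamics)
\left(\frac{\dot a}{a}\right)^{2} = \frac{8\pi G}{3}\,\rho - \frac{k c^{2}}{a^{2}} + \frac{\Lambda c^{2}}{3}

% Raychaudhuri (acceleration) equation
\frac{\ddot a}{a} = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^{2}}\right) + \frac{\Lambda c^{2}}{3}

% Energy conservation for the fluid
\dot\rho + 3\,\frac{\dot a}{a}\left(\rho + \frac{p}{c^{2}}\right) = 0
```

Once an equation of state $p = p(\rho)$ is supplied, these form a closed system for $a(t)$ and $\rho(t)$ that can be analysed in dynamical systems terms.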


Why do they have their Specific Nature? 
Secondly, 


• What underlies the existence of the specific laws $\mathcal{L}$ that hold and associated physical
possibility spaces $\Omega_P$? Why do they have the nature they have?


This is a philosophical issue, because there is no experiment we can do to test this feature: 
we can test what characteristics they have, indeed this is what we do when we determine the 
effective laws of physics, but we cannot test why they have this nature or what underpins 
their existence. As far as physical cosmology is concerned, the standard assumption is that 
their nature, having been determined in some unknown way before the universe began, 
cannot be different. They just are what they are. 
1.2.3 Contingency


Given the laws of physics, the outcome depends on a set of boundary conditions B for the 
relevant equations. These are contingent in that, unlike the laws of physics, they could have 
been otherwise. This is the essential difference between laws and boundary conditions. 
Then one has to ask what fixes the specific boundary conditions that actually occurred. In 
the past, the assumption was that they were fixed by a symmetry principle (the “Cosmological
Principle”) [11]. The current tendency is to assume they are fixed by chance, that is, some 
random process determines them [46]. The outcome (the universe that actually comes into 
being) is then fixed by the deterministic laws $\mathcal{L}$ that map $\Omega_P$ into itself, giving a unique
outcome from the initial data:

$$\mathcal{L}(B) : \Omega_P \to \Omega_P \qquad (1.2)$$


with different outcomes for different initial data B. 


1.2.4 The Relation Between Them 


But what does contingency mean in the context of the universe as a whole? Is it a mean- 
ingful concept? If so how do we cash it out? The difference is clear in the case of systems 
situated in the universe, but problematic for the universe itself. 

The problem is that, as we have only one universe to observe, we have only one set of
initial conditions to relate to; and no way to test if they could indeed have been different. 
All we experience is that one specific set of initial conditions has indeed occurred. It could 
be that only one set of initial conditions is possible: in which case their value is fixed by 
a law, not by a choice among a number of possible different initial states, which is the 
essence of the idea of contingency. 

Are there laws for the cosmos as a whole? This is highly disputed territory. In the 1960s–1970s
the idea of a Cosmological Principle was proposed [11], which in effect is a law of
initial conditions for cosmology, determining what spacetimes actually occur out of all 
those that are possible. It determines the subspace C of realizable cosmologies out of the 
set of all possibilities $\Omega_P$. Note that this is still an abstract set of possibilities, as it will
be a family of space-times rather than only a specific one with all parameters determined, 
assumed to lead somehow to one or other actually existing universe out of these possibil- 
ities. More recently the proposal that there is a law of initial conditions for the universe 
as a whole has been made by some authors on the basis of various ideas about quantum 
cosmology (see Hartle [48]). But the concept of a law that applies to only one object is 
in conflict with the essential nature of a law, namely that it applies to a class of similar 
objects. To give it meaning one in effect has to introduce an actually existing ensemble $\mathcal{E}$
of universes out of those in C, thus denying the uniqueness of the universe, together with an
explicit or implicit probability measure $M_{\mathcal{E}}$ on this ensemble, allowing one to talk about
chance in this context. 

Three problems occur. First, these proposals are plagued by infinities that tend to occur 
in such ensembles. Thus the outcome of using a proposed measure may be ill defined. 
Second, while we can argue for specific such measures $M_{\mathcal{E}}$ on various grounds, we cannot
test any proposed measure by observation or experiment, because we cannot check the
outcomes as regards numbers of universes that occur with the predicted frequencies. It
is therefore an untestable choice, with the outcome determined by a priori philosophical
assumptions (such as a “principle of mediocrity” [99]). In practice, measures on the spaces
$\mathcal{E}$ or C of cosmological models are chosen to give desired results: specifically, trying to
make the one universe we can see appear to be probable. Finally, one has to clarify if 
this measure is proposed as relating to a physically existing ensemble of universes, or a 
conceptual one. If the former, how do we show it exists? If the latter, how does it influence 
what actually happens? 


Issue 2: How can there be a meaningful difference between chance (i.e. contingent boundary condi- 
tions for cosmology) and necessity (the deterministic physical laws that act in an inevitable way, as 
characterized by possibility spaces) in the context of the existence of the unique single universe we 
probe by astronomical observations and physical experiments? 


The distinction seems rather arbitrary. We cannot establish the difference observationally 
or experimentally. Nevertheless we usually assume there is such a difference, as that is how 
ordinary physics works. 


1.2.5 Creation of the Universe 


These issues come to a head in terms of theories of creation of the universe (e.g. Hartle and 
Hawking [49]). The problem with theories of creation of the universe is that we can only 
proceed by applying physical ideas that we determine within the universe, and extrapolating
them to apply to the universe itself. It is not clear that this makes sense, inter alia because
there is no concept of space or time before the universe exists, and certainly there is no way 
to test their validity (we cannot rerun the project and try with varied starting conditions, 
for example). 

To additionally propose that one has a theory of creation of the universe “from nothing” 
(e.g. Krauss [57]) does not make scientific sense, for one can only develop a creation theory 
by assuming the nature of the laws to be applied: quantum field theory, for example [80], 
perhaps extended to the standard model of particle physics, or something like it. These are 
assumed to causally precede the coming into being of the physical universe. Calling that 
“nothing” is sleight of hand [4]: it is a very complex structure that is assumed to in some 
sense pre-exist the coming into being of the universe (for they are assumed to cause its 
existence). So where or in what sense do all these items exist prior to the existence of the 
universe itself? Presumably they are meant to inhabit some kind of Platonic space that pre- 
exists the universe, which somehow gains causal power over material entities even though 
space and time do not exist then. That is a very powerful form of existence. 

There is a lot that needs clarification here. Supposing this was indeed clarified into a 
consistent theory, it would not be a testable theory of how things happened, even if some 
proposed outcomes might be testable. 
Figure 1.1 Irreducible randomness occurs in experimental quantum physics. Double-slit
experiment performed by Dr. Tonomura showing the build-up of an interference pattern of single
electrons. The numbers of electrons are (b) 200, (c) 6,000, (d) 40,000, (e) 140,000 [from Wikimedia
Commons]. Quantum theory predicts the statistics of outcomes that eventually builds up, but cannot
even in principle predict where each individual electron will arrive. 


1.2.6 Chance: Randomness 


Chance occurs within the universe, due to statistical interaction between emergent lev- 
els of structure. However, attractors in possibility spaces determine basins of attraction, 
and so relate chance and necessity: substantial variation of data can lead to the same out- 
comes, reducing the effect of randomness. In the case of strange attractors, the effect is 
the opposite: outcomes are unpredictable in practice, even though they are predictable in 
principle. 
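
The contrast between the two kinds of attractor can be sketched numerically. The toy systems below are assumptions chosen for brevity, not models from this chapter: a linear relaxation toward a fixed point stands in for an ordinary attractor with a wide basin, and the logistic map at parameter 4 stands in for chaotic, strange-attractor-like dynamics. The function names are hypothetical.

```python
def relax_to_fixed_point(x0, rate=0.5, steps=60):
    """Iterate x -> x + rate * (1 - x); every start is drawn to the attractor x = 1."""
    x = x0
    for _ in range(steps):
        x += rate * (1.0 - x)
    return x


def max_separation(x0, y0, r=4.0, steps=60):
    """Follow two logistic-map orbits and return the largest gap seen between them."""
    x, y = x0, y0
    widest = abs(x - y)
    for _ in range(steps):
        x = r * x * (1.0 - x)  # fully deterministic update rule
        y = r * y * (1.0 - y)
        widest = max(widest, abs(x - y))
    return widest


if __name__ == "__main__":
    # Very different initial data, same outcome: the basin of attraction absorbs the variation.
    for start in (-5.0, 0.3, 7.0):
        print(start, "->", relax_to_fixed_point(start))

    # Nearly identical initial data, widely different orbits: predictable in
    # principle, but not in practice once the initial error has been amplified.
    print("largest gap:", max_separation(0.2, 0.2 + 1e-10))
```

Both update rules are deterministic; only the second amplifies small uncertainties in the initial data, which is the sense in which its outcomes are unpredictable in practice.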


Irreducible Quantum Uncertainty 


However, in addition, irreducible quantum uncertainty occurs in the universe due to 
quantum physics effects: the initial data do not determine the outcome uniquely, even 
in principle, because quantum physics underlies all physics. This is illustrated in the 
double-slit experiment outcomes shown in Figure 1.1. 

Assuming that standard quantum theory as tested by many experiments is right, there 
is no way even in principle of determining where each individual photon will end up on 
the screen after passing through a very narrow double slit, although the statistics of the 
outcomes are fully determined by the Schrödinger equation [37, 55, 108].
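
The gap between determined statistics and undetermined individual arrivals can be mimicked with a small Monte Carlo sketch. Everything here is an illustrative assumption: the intensity profile is an idealised cos² fringe pattern with no single-slit envelope, the fringe constant is arbitrary, and a seeded pseudo-random generator (itself deterministic) merely stands in for quantum randomness. It illustrates the statistical pattern, not irreducibility.

```python
import math
import random

FRINGE_K = 3.0 * math.pi  # hypothetical fringe constant: six full fringes on [-1, 1]


def intensity(x):
    """Idealised two-slit intensity at screen position x, normalised to a peak of 1."""
    return math.cos(FRINGE_K * x) ** 2


def detect(n_hits, seed):
    """Draw n_hits arrival positions by rejection sampling from the intensity profile."""
    rng = random.Random(seed)
    hits = []
    while len(hits) < n_hits:
        x = rng.uniform(-1.0, 1.0)
        if rng.random() < intensity(x):  # accept with probability equal to the intensity
            hits.append(x)
    return hits


def bright_fraction(hits):
    """Fraction of arrivals landing where the intensity exceeds half its peak value."""
    return sum(1 for x in hits if intensity(x) > 0.5) / len(hits)


if __name__ == "__main__":
    run_a = detect(20000, seed=1)
    run_b = detect(20000, seed=2)
    print("first arrivals:", run_a[0], run_b[0])  # differ from run to run
    print("bright fractions:", bright_fraction(run_a), bright_fraction(run_b))
    print("analytic value:", 0.5 + 1.0 / math.pi)
```

For this profile the analytic bright fraction is 1/2 + 1/π. Each run approaches it as arrivals accumulate, while the position of any single arrival differs from run to run.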
Now it is rather strange that many working physicists talk about quantum physics as if
it were a deterministic theory. This is because the Schrödinger equation is a deterministic
equation, uniquely determining the evolution of the wave function $\Psi$ from initial data.
But the value of $\Psi$ does not determine specific physical outcomes: it determines only the
statistics of these outcomes [37, 38, 55]. When a measurement takes place, the unitary
Schrödinger equation does not apply: a superposition of eigenstates collapses to a single
eigenstate (see e.g. Zettli [108]: 158), which is a non-unitary change. The philosophy seems 
to be that as the behavior of an ensemble of events, and in particular the statistics of associ- 
ated energy changes and scattering angles, is uniquely determined, quantum mechanics is 
a deterministic theory. But there is no ensemble to consider without individual events (as
is seen in Figure 1.1), and irreducible uncertainty applies to the individual events. Physics
is in fact unable to predict unique specific outcomes from initial data: it does not (at the 
micro level) determine the particular individual things that happen. 

Various attempts have been made to show this is not the case, e.g. the Many Worlds 
(Everett) interpretation of quantum physics [55], and the Bohm Pilot Wave Theory [8]. 
In these cases one has a deterministic (unitary) outcome taking place behind the scenes of 
what is actually experienced by physicists working in a laboratory. None of these proposals 
alters the fact that experiments definitively demonstrate that we actually experience irre- 
ducible quantum uncertainty in a laboratory as we try to predict the outcomes for specific 
individual events from measurable initial data [37, 108]. 


Unpredictability in Cosmology 


Now one might think that this is irrelevant to physics at a cosmological scale, because 
this only happens at microscales. However, if the current standard model of cosmology 
(see Section 1.4) is correct, this is not the case, because in this theory, quantum fluctu- 
ations during the inflationary era are amplified to macroscopic scales by the exponential 
expansion that occurs, and then result in the seed perturbations on the last scattering sur- 
face that lead to the formation of clusters of galaxies and galaxies by a bottom-up process 
(small-scale structures form first, and then assemble to form larger scale structures later). 

This means that if we knew everything measurable about the universe at the start of 
inflation, this would not determine the specific individual astronomical structures that we 
can observe in the universe at later times. The statistics of these structures is predictable and 
indeed is used as a test of cosmological theory [24, 69, 25, 31, 81]. The existence of specific 
individual structures such as our own galaxy is not caused by the quantum processes in 
operation, for this is the nature of quantum physics as has been conclusively determined 
by laboratory experiment, as shown in Figure 1.1: it is subject to irreducible uncertainty. 
Given complete data at the start of the universe, cosmological theory cannot predict the 
specific structures that will later come into being. The conclusion is that necessity does not 
always apply in the universe, even at cosmological scales. However, the statistics of what 
happens is uniquely determined. 
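
A toy one-dimensional analogue makes the distinction between fixed statistics and unfixed realisations concrete. The sketch below rests on loud simplifications: the power spectrum P(k) = 1/k is a hypothetical placeholder rather than a cosmological fit, mode amplitudes are held fixed so that only the phases are random, and seeded pseudo-random phases stand in for quantum fluctuations. All names are illustrative.

```python
import math
import random

N_MODES = 16   # number of Fourier modes kept (illustrative choice)
N_GRID = 128   # grid points on [0, 2*pi); must exceed 2 * N_MODES


def power(k):
    """Hypothetical power spectrum P(k) = 1/k (placeholder, not a cosmological fit)."""
    return 1.0 / k


def realise(seed):
    """Build one field realisation: amplitudes set by P(k), phases drawn at random."""
    rng = random.Random(seed)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(N_MODES)]
    field = []
    for j in range(N_GRID):
        x = 2.0 * math.pi * j / N_GRID
        value = sum(
            math.sqrt(2.0 * power(k)) * math.cos(k * x + phases[k - 1])
            for k in range(1, N_MODES + 1)
        )
        field.append(value)
    return field


def variance(field):
    """Spatial variance of the field over the grid."""
    mean = sum(field) / len(field)
    return sum((v - mean) ** 2 for v in field) / len(field)


if __name__ == "__main__":
    a, b = realise(seed=1), realise(seed=2)
    # Same statistic in both realisations: the variance equals the summed power.
    print(variance(a), variance(b), sum(power(k) for k in range(1, N_MODES + 1)))
    # Different specific structure: the highest peak sits at a different grid point.
    print(a.index(max(a)), b.index(max(b)))
```

In this simplified setting the variance is fixed by the spectrum alone, whereas where the peaks fall (the analogue of where particular galaxies form) depends on the random phases.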
1.2.7 The Origin of Life in the Universe


It is a moot issue as to whether cosmology should be considered as extending to discussion 
of the origins of life in the universe. Cosmologia must certainly do so — it is central to 
that enterprise. The viewpoint suggested here will be that the origin of life lies outside the 
domain of cosmology proper, as it raises so many new kinds of issues to do with the nature 
of causation in biological systems that are unrelated to the gravitational and astrophysical 
issues attended to by cosmology. 

In particular, a key feature of life is that all biological structures have a function or 
purpose (Hartwell et al. [50]), and that must be taken into account in an adequate way if 
one enters this territory. Life also crucially involves the idea of adaptive selection to the 
social and ecological environment. It is dealing with other kinds of causal processes than 
are considered by physical cosmology. They certainly do take place in the universe as we 
know it, but physics per se does not consider them. 


1.3 Priors of the Theory 


The priors of the theory are the range of alternatives set at the outset. Given an understand- 
ing of the relation between chance and necessity, the next issue is the range of causal and 
contingent priors: what is presumed to exist at the start of the evolution of the universe? 
What range of alternatives will be considered for necessity, and for contingent elements? 
Only if we consider sufficient alternatives can we consider how robust our theory is in the 
space of theories. 


1.3.1 Necessity: the Prior Alternatives Considered 


The possibility spaces specified in a theory depend on the range of alternatives we are 
prepared to consider, which are defined by the priors of that theory. 

What range of physical laws or regularities is assumed to hold to embody necessity? For 
example: 


• What range of theories of gravitation will we consider: general relativity, scalar-tensor 
theories, f(R) theories, bimetric theories, conformal theories, fourth-order theories, 
theories with torsion, and so on (see e.g. Gair et al. [42])? 

• What kinds of theories of dark energy will be included, as well as a cosmological 
constant? Will they for example include quintessence, phantom energy, or chameleon 
theories? 

• What possibilities will we take into account for dark matter? MACHOs, WIMPs, axions, 
supersymmetric partners of known particles? 

• What inflaton possibilities are included? Single field, multiple field, torsion? Will they 
for example include a non-minimally coupled Higgs field? 

• Will we look at alternatives to inflation? Will we consider pre-big bang models, string 
gas cosmology, any of the various cyclic models? 

• Will we look at versions of inflationary theory without an initial singularity [30, 23] 
(which are possible, because they do not obey the energy conditions needed for the 
singularity theorems [51])? 

• Are fundamental constants to be taken as priors, or to be determined by the theory? Will 
they be allowed to vary in time [96]? 

• If one is to attempt a theory of creation of the universe, what will be the prior 
assumptions? What quantum gravity options will we explore (string theory, loop quantum 
gravity, causal set theory, or something else [67])? Do we assume physical laws preceded 
the existence of the universe (in causal terms), or came into being with the universe? 

• Will we propose a model starting from “nothing”? If so, what is “nothing”? Can 
“nothing” exist? Can it cause anything? 


Overall, there are many possibilities. To check the uniqueness and robustness of the stan- 
dard model in the space of all models, we should check alternatives; we cannot test 
alternatives unless we consider them. 


1.3.2 Geometry and Topology: the Priors Considered 


Similarly, what range of initial or boundary conditions for the theory will be taken into 
account in order to establish its robustness? 

There is a larger range of contingent possibilities (initial conditions) than often 
considered [31]: 


• Will we consider alternative spatial topologies [59], or only universes with simply 
connected 3-spaces? There is an infinite number of topologies for the case $k = -1$. 

• In particular, will we consider small universes, that we could have seen around since 
decoupling [32, 21]? These are the only universes where we can see all that exists in the 
universe, and perhaps even see our own galaxy as it was in the past. 

• Will we consider models with positive and negative spatial curvature? Many papers do 
not, but positive curvature makes a huge difference as regards possibilities in the future 
and the past [84], as well as suggesting a finite universe and simplifying boundary value 
problems. 

• Will we consider anisotropic modes (as in Bianchi spatially homogeneous models 
[103])? If one believes in generic conditions, they will be there too. 

• Will we consider inhomogeneous (Lemaître–Tolman) models [10], where the 
Copernican principle does not hold? These have the potential to explain the supernova 
magnitude-redshift relations without the need for any dark energy or cosmological 
constant [16]. One can only see how viable they are by examining their observational 
implications. 

• Will we consider models based on voids, such as the Lindquist/Wheeler models, where 
the large-scale homogeneity is not assumed from the outset but rather is derived as an 
averaged outcome of masses imbedded in vacuole regions [20]? 


14 George F. R. Ellis 


We need some foundation to proceed as the basis for adequately studying the relation 
of chance and necessity. As an example, it is somewhat paradoxical that while inflation 
is claimed to solve the smoothness problem, almost all studies consider only 
almost-Friedmann–Lemaître–Robertson–Walker (FLRW) models, which start off extremely 
smooth. 


1.3.3 Priors for Studying Cosmologia 


Extending the scope of consideration to cosmologia will imply consideration of priors 
concerning a range of further issues, for example the existence of possibility spaces to do 
with life and thought. Thus these may include considering as priors: 

Possibility spaces $\Omega_L$ for the existence of life, for example those discussed by Wagner 
[101]. If our studies extend to the existence of life, we must necessarily be concerned 
about the prior conditions that allow life to exist. The possibility of life existing is an 
eternal unchanging feature of the universe, depending on the nature of the laws of physics 
and their relation to biological possibilities; whether it actually comes into existence is a 
separate issue, based on contingent features related to the nature of the particular universe 
or universes that happen to come into existence. 

Abstract Platonic spaces related to the logical operation of the mind, for example 
a Platonic world $\Omega_M$ of mathematical abstractions (Changeux and Connes [17], Penrose 
[75]), learnt about by the human mind through its neural network structure (Churchland 
[18]), and inter alia underlying the practice of theoretical physics. The point is that key 
parts of mathematics are discovered rather than invented, e.g. the fact that the square root 
of 2 is an irrational number, the value of $\pi$, and Pythagoras’ theorem. These are abstract 
truths that are independent of the existence and culture of human beings, and pre-exist the 
existence of life, perhaps even of the universe itself. While they need not be considered by 
cosmology, cosmologia must consider them too, for they are part of what exists and helps 
shape human thoughts and culture. 

In each case the distinction between what actually exists (the Platonic reality) and what 
we know about it (our representations of this reality), as discussed in Section 2.1, will 
hold. Why they exist with the form they have, and the nature of their relation to possibility 
spaces for physical laws, is a deep philosophical mystery. 


1.4 Theoretical Outcomes 


The standard theory gives a set of outcomes of explanatory and descriptive models that are 
generally agreed on, plus some further claims that are more controversial. 


1.4.1 Outcomes of the Theory: models 


Outcomes include models of what exists at a cosmological scale, claims about physical 
causes of their existence in terms of effects of forces acting on matter, claims about having 
tested these statements through astronomical observations as well as laboratory or collider 
experiments where relevant, and claims of unifications of previously separate domains and 
results to create such models. 

The basic model [24, 25, 31, 81] is a model of the smoothed-out geometry of the 
universe (usually a Robertson–Walker spacetime) and its time evolution, driven by 
matter, radiation, and possibly a cosmological constant. It extends to including astrophys- 
ical and physical aspects of the early universe associated with the hot big bang era such 
as baryosynthesis, nucleosynthesis, a radiation-dominated to matter-dominated transition, 
and decoupling of matter and radiation leading to relic blackbody cosmic radiation. It will 
usually include an even earlier inflationary era driven by one or more effective scalar fields, 
and a later dark energy-dominated era. 


Outcome 1: Background Model 


The first major outcome is the standard expanding-universe background FLRW 
model governed by gravity [51]. Its metric in comoving coordinates is: 


$$ds^2 = -dt^2 + a^2(t)\{dr^2 + f^2(r)\,d\Omega^2\} \qquad (1.3)$$ 


where $a(t)$ is the scale factor, $f(r)$ is $\{\sin r, r, \sinh r\}$ if the normalised spatial curvature $k$ 
is $\{+1, 0, -1\}$, and $d\Omega^2$ is the metric of a unit 2-sphere [84]. The dynamics is very early 
inflation followed by a hot big bang radiation-dominated era, including nucleosynthesis, 
followed by decoupling and a matter-dominated era. At late times, dark energy, possibly 
a cosmological constant, dominates and causes an acceleration of the rate of expansion. 
Alternative models may allow for anisotropy [103] or inhomogeneity [10, 16, 20, 31]. 
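
As a small illustration of Eq. (1.3), the sketch below encodes the three choices of f(r) and the proper area of a sphere of coordinate radius r, which follows from the angular part of the metric as 4π a² f²(r). The function names are illustrative choices, not notation from the text.

```python
import math


def curvature_function(r: float, k: int) -> float:
    """Return f(r) in Eq. (1.3) for normalised spatial curvature k."""
    if k == 1:
        return math.sin(r)
    if k == 0:
        return r
    if k == -1:
        return math.sinh(r)
    raise ValueError("k must be +1, 0 or -1")


def sphere_area(r: float, k: int, a: float = 1.0) -> float:
    """Proper area 4*pi*a^2*f(r)^2 of the 2-sphere at coordinate radius r."""
    return 4.0 * math.pi * (a * curvature_function(r, k)) ** 2


if __name__ == "__main__":
    for k in (1, 0, -1):
        print(k, sphere_area(1.0, k))
```

At the same coordinate radius the sphere area is smallest for k = +1 and largest for k = -1, which is one way the sign of the spatial curvature shows up in observable relations.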

Predictions: cosmological relics The standard model supplemented by physics and 
astrophysics firstly predicts what can be called geological data: that is, data about present- 
day entities that relates to events on our world line in the far distant past [26]. These include 
for example element abundances due to primordial nucleosynthesis [71, 89] followed by 
stellar nucleosynthesis. These all put constraints on the background cosmology. Then there 
is the relic background radiation; but that is detected by telescopes, so is dealt with in the 
next paragraph. 

Predictions: astronomical data Secondly, there are astronomical data gathered from 
light emitted on our past light cone [58, 26]. The standard model predicts area distance 
versus redshift relations which are observable via standard candles or standard-size objects 
[87], together with number count versus redshift relations for objects whose observable 
numbers are related to the density of matter in the universe by a standard bias factor. It also 
predicts cosmic blackbody radiation as a relic of the hot big bang era, with a very accurate 
blackbody spectrum [39] and complex angular power spectrum [25]. 
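
To make the area distance versus redshift relation concrete, here is a minimal sketch for an FLRW model containing only matter, curvature, and a cosmological constant (radiation is neglected). The default parameter values are round illustrative numbers, not values quoted in the text, and the function names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def hubble_rate(z, h0, omega_m, omega_lambda):
    """H(z) in km/s/Mpc for matter + curvature + cosmological constant."""
    omega_k = 1.0 - omega_m - omega_lambda
    zp1 = 1.0 + z
    return h0 * math.sqrt(omega_m * zp1**3 + omega_k * zp1**2 + omega_lambda)


def area_distance(z, h0=70.0, omega_m=0.3, omega_lambda=0.7, steps=1000):
    """Area (angular-diameter) distance in Mpc at redshift z."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    dz = z / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight / hubble_rate(i * dz, h0, omega_m, omega_lambda)
    comoving = C_KM_S * total * dz / 3.0  # line-of-sight comoving distance

    omega_k = 1.0 - omega_m - omega_lambda
    d_h = C_KM_S / h0  # Hubble distance
    if abs(omega_k) < 1e-12:
        transverse = comoving  # flat case, f(r) = r
    elif omega_k > 0:
        s = math.sqrt(omega_k)  # negative spatial curvature, sinh
        transverse = d_h / s * math.sinh(s * comoving / d_h)
    else:
        s = math.sqrt(-omega_k)  # positive spatial curvature, sin
        transverse = d_h / s * math.sin(s * comoving / d_h)
    return transverse / (1.0 + z)


if __name__ == "__main__":
    for z in (0.1, 0.5, 1.0, 2.0, 5.0):
        print(z, round(area_distance(z), 1))
```

Comparing such a curve with distances inferred from standard candles or standard-size objects is what constrains the density parameters.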

These predictions entail a number of model parameters [81, 1, 60] such as the Hubble 
constant, the density of baryonic matter, the density of cold dark matter, and the value of 
the cosmological constant, that are not fixed by the theory but rather are observationally 
determined. Extended models predict anisotropies that can occur in the model [58] or 
geometrical effects such as transverse versus tangential Hubble parameter and redshift space 
distortions [64]. 

Figure 1.2 The expansion history of the universe (from: WMAP website 
http://map.gsfc.nasa.gov/). 


Outcome 2: Perturbed Models 


The further outcome is statistical predictions for perturbations about the background 
spacetime, giving a model of structure growth [24] and its effects on CBR anisotropies 
[86, 25]. 

The standard model proposes creation of seed fluctuations by quantum effects during 
inflation [69], followed by baryon-radiation acoustic oscillations in the hot big bang era, 
all leading to seed perturbations on the last scattering surface. These then lead to growth 
of inhomogeneities by gravitational attraction acting on cold dark matter and baryons, 
modified by dark energy, providing a physical model for the emergence of large scale 
structures [81]. 

Predictions: The model predicts the statistics of matter (power spectra and correlation 
functions) together with a background radiation anisotropy power spectrum exhibiting a 
Sachs–Wolfe plateau at large angular scales and baryon acoustic oscillation (BAO) peaks 
at smaller angular scales [24, 25, 81]. It has a basic parameter (the amplitude of 
perturbations on the last scattering surface) that is not predicted by the theory but rather 
is observationally determined. Together with other parameters and observations such as 
the spectral tilt and the B-mode polarization spectrum, it can be used to choose between 
various inflationary proposals. 

This is all summarized in the iconic Figure 1.2, provided by the WMAP team. From the 
inflationary era onwards, this model is tested by many kinds of data that fit together in a 
solid way to support it (see Section 5.5). However major issues remain. 


Outcome Limits 


There are substantial limits to the outcomes provided by the standard model of cosmology, 
within the domain of what it aims to explain. We do not have a unique model for inflation 
(we do not know what the inflaton is), nor do we have a good understanding of what dark 
matter is, nor do we know the nature of dark energy. We also do not have a good understanding 
of the quantum-to-classical transition that changes quantum fluctuations in the inflationary 
era into classical fluctuations at the surface of last scattering [79, 91], although there are 
some tentative models of this process [12]. 

Despite these lacunae, it gives a good model of the geometry and evolution of the universe 
and of structures in it, confirmed by the data (see Section 1.5 below). 


1.4.2 Proposed Outcomes: Extended Models 


The theory is extended by some workers in two different ways, that are not observationally 
testable in the same way as the standard model. 


Extension 1: The Start of the Universe 


A variety of models has been proposed for the start of the universe in terms of its origins 
from some kind of pre-existing state (perhaps a collapsing phase, or a state with positive 
definite signature, or a non-singular quantum gravity era), through some kind of laws of 
creation of the universe, or through some specific assumed boundary conditions (including 
the Hawking–Hartle no-boundary proposal [49]). These then provide conditions for 
a start to the inflationary era. There is some observational constraint on such proposals 
through their influence on the inflationary era, but the assumed geometry at that time can- 
not be observed and the physics or pre-physics presumed to hold at this time cannot be 
experimentally tested. 

However, whatever happens then is required to produce quite special initial condi- 
tions for inflation in relation to the space of all possible geometries and their associated 
gravitational entropy (Penrose [74]). As mentioned above, you cannot study this by only 
considering perturbed FLRW models: you have to take into account highly inhomogeneous 
and anisotropic geometries, which greatly outnumber almost-FLRW models; and 
in most of them, inflation will never start, because anisotropy terms dominate the scalar 
field terms [85]. 
Extension 2: The Universe Beyond the Visual Horizon 


Some workers make strong claims about what exists beyond the visual horizon; specifically 
there are claims of the existence of a multiverse of one kind or another [62, 107, 45, 94]. 
For example, eternal new inflation is claimed to lead to an infinite number of pocket 
universes, and eternal chaotic inflation can follow from specific choices of the inflaton 
potential [46]. 
Sometimes these may lead to bubble collisions which are in principle observable [40]; 
however many would not. It is also sometimes claimed that physical parameters such as the 
cosmological constant will be different in different bubbles; however this is an additional 
assumption, requiring a hypothesized mechanism for creating this difference. It does not 
follow from the geometry per se. And if it is assumed to follow from the landscape of 
string theory/M theory, this is of course an unconfirmed speculation about the nature of 
fundamental physics [7]. 

The physics presumed to lead to these results (i.e. the specific assumed inflaton poten- 
tials that underlie them) cannot be directly tested; however one can get some limits on 
them from their influence on structure formation and its effects on cosmic microwave 
background (CMB) anisotropies and polarization [65]. The associated measures that are 
claimed to make such results probable or improbable are not testable. 


1.4.3 Outcomes of the Theory: Unifications 


A major goal of science is to unify previously disparate areas, for example unifying the 
force causing apples to fall with that causing the Moon to orbit the Earth (Newton), or 
unifying in a single set of equations electricity and magnetism (Maxwell). 


Outcome 3: Unifications of Physics and Cosmology 


Thus a key further outcome of cosmological theory is unification of a series of physics 
theories with cosmological theory. The standard model unifies: 


• Gravitation on Earth and in the Solar System with the dynamics of the universe, leading 
to prediction of the time evolution of the universe [41, 84]; 

• Atomic physics in a laboratory with tight coupling in the hot big bang era and the process 
of decoupling, leading to prediction of the existence of cosmic blackbody background 
radiation [5, 71, 25]; 

• Nuclear physics in a reactor with the process of cosmological nucleosynthesis, leading 
to prediction of primordial element abundances [71, 102]. 


Limits on Unifications 
The intended unification of particle physics with early universe processes has however not 
come to fruition because the process of baryosynthesis is not yet linked to laboratory 
experiments, and the inflaton has not been identified with any known physical field, and hence 
is not actually linked to established particle physics. The latter unification would however 
be achieved if the inflaton were the Higgs particle [35], as originally supposed [46]. 


1.5 Data for the Theory: Testing of Models 


A key issue is: what data are to be used to test the theory? Basic physics data are assumed 
as the foundation for the theory. The major data specifically for cosmology come from 
telescopes at all wavelengths, ground based and satellite based, and will eventually extend 
to gravitational wave detectors and perhaps neutrino telescopes. 


1.5.1 Laboratory Data and Solar System Tests 
These give data in two ways. 


The Nature of Particles and Forces 


Firstly, they explore the nature of the gravitational force that underlies cosmological 
dynamics [51]. They confirm that Einstein’s General Theory of Relativity is an excellent 
classical theory of gravitation, correctly predicting all solar system gravitational effects to 
high accuracy and they place limits on deviation from General Relativity Theory. 

They also explore the nature of particles that interact via those forces. In particular, dark 
matter is not ordinary baryonic matter, and its nature is unknown. Laboratory experiments 
and collisions at particle colliders constrain theories regarding dark matter in important 
ways, and might even lead to its discovery. 


Cosmic Relics 


Secondly, they confirm the nature of what presently exists as a result of the physics operat- 
ing in the early history of the universe: the existence of matter (leptons, baryons), photons, 
neutrinos, and of the elements that make up our existence (particularly carbon, hydrogen, 
nitrogen, oxygen, phosphorus) arising first from primordial nucleosynthesis and then from 
processing in stellar interiors. Relics from the past are evidence about the past, and so 
constrain our past thermal and physical history. Measurements of solar system and stellar 
abundances of elements confirm the standard picture of nucleosynthesis in a satisfactory 
way, up to some queries about the abundance of lithium that still need clarification [89]. 

The theory predicts some relics that are probably not directly observable: namely cos- 
mological neutrino and gravitational wave backgrounds. However, their indirect effects 
may be observable. 


1.5.2 Astronomical Data 


The specific astronomical data testing cosmological theories are [58, 31, 81]: 


• Galaxy and supernovae magnitude-redshift curves, 

• Source number counts as a function of redshift or luminosity, 

• Matter power spectra, including a BAO peak, 

• Matter 2-point and 3-point angular correlation functions, 

• The CMB energy spectrum, particularly deviations from a blackbody form, 

• CMB angular power spectra, with a Sachs–Wolfe plateau and acoustic peaks, 

• CMB polarization power spectra, also with peaks, 

• Weak gravitational lensing observations, constraining dark matter. 


The analysis of data is made complex by peculiar velocities and associated redshift space 
distortions [78], as well as weak lensing of distant sources [104]. The CMB bispectrum is a 
probe of the non-Gaussianity of primordial perturbations. The astronomical data test both 
the background model and structure formation in that model. 


1.5.3 The Basic Model: Isotropy and Spatial Homogeneity 


The first question is whether we can show that the averaged geometry of the universe 
is indeed spatially homogeneous and isotropic, as represented by the FLRW family of 
models. 


Spatial Isotropy 


Isotropy of observations on large averaging scales is well established. In particular the 
matter distribution is isotropic on large enough averaging scales, and the CMB temperature 
is isotropic to one part in $10^5$ once the dipole due to our motion relative to the universal 
rest frame is removed. 

This establishes that in comoving coordinates the metric can be written as: 


$$ds^2 = -f^2(r,t)\,dt^2 + a^2(r,t)\{dr^2 + R^2(t,r)\,d\Omega^2\} \qquad (1.4)$$ 


with $f(r,t) = a(r,t) = 1$ on the central world line $r = 0$. We are supposed to be close to 
that worldline. 


Spatial Homogeneity 


However, testing spatial homogeneity, that is, showing the metric in Eq. (1.4) can be 
reduced to Eq. (1.3), is more difficult, because we can only see down one past light cone. 
Nevertheless it is indeed testable in various ways [64]: 


• Galaxy number counts suggest spatial homogeneity, as pointed out by Hubble and 
Peebles [72]. However, we observe distant galaxies with large look-back times, so their 
intrinsic luminosity may have been different at that time, and this compromises the 
test [68]. Indeed radio source number counts are incompatible with a Robertson–Walker 
geometry unless there is substantial evolution of either source numbers or luminosity. 

• Good standard candles such as type Ia supernovae can be used to test spatial homogeneity 
through a null test independent of the gravitational theory and also of the equation of state 
of matter [19]. 

• Time drift of cosmological redshifts can also be used to test spatial homogeneity [97], 
but the measurements are a long term project. 

• The most sensitive test however is via the kinematic Sunyaev–Zeldovich effect [110], 
which implies that deviations from spatial homogeneity cannot be large enough to 
explain the apparent acceleration of the universe at late times [64].¹ 


The latter data in particular give us good grounds for adopting an FLRW model for the 
visible universe on large scales. The Copernican principle does not have to be adopted as 
an a priori philosophical postulate, as was assumed in previous decades (see e.g. Bondi 
[11]): it is an observationally testable hypothesis. 


1.5.4 Characterizing Geometry: Perturbed FLRW Models 


Given that the universe is well described by an FLRW model, one wants the specific 
parameters describing the best-fit such model. 


The Parameters 


The key set of parameters describing these models is [1, 60]: 


1. The Hubble constant, $H_0 = 100h$ km/s/Mpc. 

2. The physical density of baryons, $\Omega_b h^2$. 

3. The physical density of cold dark matter, $\Omega_{dm} h^2$. 

4. The energy density of dark energy in units of the critical density, $\Omega_\Lambda$. 

5. The equation of state of dark energy, $w$ ($w = -1$ for a cosmological constant). 

6. The reionization optical depth, $\tau$. 

7. The power-law spectral index $n_s$ of primordial density (scalar) perturbations ($n_s = 1$ for 
scale-free primordial perturbations). 

8. The amplitude of primordial scalar curvature perturbations, $A_s$. 

9. The perturbation scalar to tensor ratio $r$. 
Items 1-6 are properties of the background model, while items 7, 8, and 9 are to do with 
perturbations. However, tests of the background model, determining all these parameters, 
are best done via studying the matter power spectrum and CMB power spectrum and 
anisotropy power spectrum, which are properties of perturbations of the background model. 
One uses statistical fitting that determines all these parameters together (background and 
perturbations) from a combination of observations via many telescopes [24, 81, 1]. 

We normally assume the initial fluctuations are adiabatic and Gaussian, as predicted 
by simple inflationary theory [69]. One can add variables that test this assumption. 
These data put limits on the nature of dark matter, of dark energy, and on the nature of 
the inflationary era. We can also test if we live in a small universe in various ways [32], but 
particularly by looking for identical circles in the CMB sky [21]. This adds up to a detailed 
model of physical cosmology that is well tested by observations. 
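
As a rough sketch of how the background parameters in the list above relate to one another, the snippet below converts the physical densities into density fractions and infers the curvature term from the requirement that all fractions sum to one. The input numbers are round illustrative values, not measurements quoted in the text; radiation and neutrinos are neglected, and the function name is hypothetical.

```python
def density_fractions(h, omega_b_h2, omega_dm_h2, omega_lambda):
    """Convert physical densities to fractions of the critical density.

    Assumes only baryons, cold dark matter, and dark energy contribute;
    the curvature term omega_k is whatever is needed to make the sum 1.
    """
    omega_b = omega_b_h2 / h**2
    omega_dm = omega_dm_h2 / h**2
    omega_m = omega_b + omega_dm
    omega_k = 1.0 - omega_m - omega_lambda
    return {
        "omega_b": omega_b,
        "omega_dm": omega_dm,
        "omega_m": omega_m,
        "omega_lambda": omega_lambda,
        "omega_k": omega_k,
    }


if __name__ == "__main__":
    # Illustrative inputs only (assumed, not taken from the text).
    fractions = density_fractions(
        h=0.7, omega_b_h2=0.022, omega_dm_h2=0.12, omega_lambda=0.7
    )
    for name, value in fractions.items():
        print(f"{name}: {value:.4f}")
```

In practice these quantities are not computed one at a time: as the text notes, a statistical fit determines all parameters together from the combined data.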


¹ This is in effect an implementation of the Almost Ehlers–Geren–Sachs theorem [90]. 
Outcome: The standard model of cosmology is well tested within the observational horizons. Its 
basic parameters are well constrained by observations, as is the existence of dark matter and dark 
energy, and observations also constrain the kinds of inflationary epoch that might have occurred. 
However, it has problems to do with the nature of the inflaton, dark matter, and dark energy. None of 
these has been identified. 


Thus the standard model may be regarded as observationally well established, but with 
significant unknown aspects of important dynamical elements. 


1.5.5 Predictions Made 


The gold standard of scientific theory is predicting the results of observations before they 
are made. Cosmology can indeed claim to have done that, albeit in a slightly tortuous way, 
in the following cases: 


1. The expansion of the universe was predicted by Friedmann in 1922 before it had been 
observationally tested [41]. It was independently predicted by Lemaître in 1927 [61], 
with prediction of a linear redshift–distance relation, as well as presentation of some 
observational data supporting this idea, before Hubble’s famous paper in 1929 [54]. 

2. The existence of cosmic blackbody radiation was predicted by Alpher and Hermann in 
1948 [5], well before it was detected in 1965. The COBE and WMAP satellites verified 
the blackbody nature of the CBR spectrum with great precision; it has a temperature of 
2.72548 ± 0.00057 K [39]. 

3. Acoustic peaks in the CMB power spectrum were predicted by Peebles and Yu [73] 
with details of the relation filled in by Bond and Efstathiou [9] and then many others. 
Recent satellite observations, e.g. by the Planck satellite [1], confirm these predictions 
in a very satisfactory way. 


The latter is particularly impressive, as the structure of the curves is determined by just a 
few parameters [81]. 
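
For the blackbody relic in item 2 above, a short sketch shows what the quoted temperature implies for the spectrum: the frequency at which the Planck function peaks. The calculation uses the standard Planck law and the exact SI values of the Planck and Boltzmann constants; the function names are illustrative and nothing here is taken from the cited papers.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant in J s (exact SI value)
K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def wien_x(iterations: int = 50) -> float:
    """Solve x = 3 (1 - exp(-x)), the peak condition for B_nu ~ x^3 / (e^x - 1).

    Here x = h nu / (k T); fixed-point iteration converges quickly from x = 3.
    """
    x = 3.0
    for _ in range(iterations):
        x = 3.0 * (1.0 - math.exp(-x))
    return x


def peak_frequency_hz(temperature_k: float) -> float:
    """Frequency at which a blackbody of the given temperature peaks."""
    return wien_x() * K_BOLTZMANN * temperature_k / H_PLANCK


if __name__ == "__main__":
    print(peak_frequency_hz(2.72548) / 1e9, "GHz")
```

The peak falls in the microwave band, which is why the relic radiation is observed as a microwave background.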


1.5.6 Relation to Cosmologia 


Clearly none of the above data or models have anything direct to do with understandings of 
purpose, meaning, and the existence of life, except weakly in terms of providing conditions 
for galaxies and stars to come into existence. While these are necessary for the existence 
of life, they are far from sufficient, and they say nothing about the nature or concerns of 
any intelligence that may emerge. This is theory and data suited for the study of cosmology, 
not cosmologia. 


Issue 3: Adequate data The data considered must be adequate to support the theory developed, 
whether cosmology or cosmologia. If it is to do just with cosmology, the data discussed in Section 
1.5.5 will suffice. If it has to do with the existence of life and intelligence, it must include priors and 
data relevant to these areas too. 
Extending the scope of consideration to cosmologia will imply consideration of data 
about a range of further issues, for example the existence of possibility spaces to do with 
life and thought mentioned in Section 1.3.3. However they are handled, it will look at data 
to do with intention and meaning and purpose as well as chance, at data to do with good and 
evil as well as impersonal laws of physics. It is a very different project from cosmology. 


1.6 Limits to Testing of Outcomes 


We can test outcomes by astronomical, physical, and geological data; how uniquely does 
this constrain the model? There are substantial limits to the testing of cosmological models 
firstly because of the uniqueness of the universe, and secondly because we observe it from 
just one spacetime event. 


1.6.1 The Basic Limitations 


The basic limitations result from the uniqueness of the Universe on the one hand and its 
vast size and age on the other. 


Uniqueness 


Firstly, the uniqueness of the universe [66, 27] prevents us from comparing it with other 
similar objects or re-running it again with other initial conditions. Hence we cannot do 
the kind of experimental testing that is possible for theories about other physical objects. 
However, we are able to test its unique history that in fact happened through cosmological 
relics, that is, “geological” kinds of data. 

This is quite different from experimental science, where we can set up experiments with 
different initial conditions to see the outcomes. It is also different from other historical 
sciences, such as the study of evolution of life and of continental drift. In those cases we 
can look at many items of data that are relevant to huge numbers of plants and animals 
affected by evolutionary theory in the first case, and the various continents and sea beds 
affected by continental drift in the second. Theory can thus be tested in different contexts, 
unlike the case of cosmology. 


Only One Viewpoint 


Secondly, because of the vast scale and age of the universe, on a cosmological scale we are 
unable to move from a single event in spacetime [26]. 

The Earth’s distance from the Sun is about 8 light minutes, a galaxy is about 50,000 light 
years in diameter, and the Hubble scale is about $10^{10}$ light years. The recorded history of 
the human race is about $10^{-7}$ of the age of the universe. Consequently we only see a two- 
dimensional projection of a four-dimensional spacetime, because (at a cosmological scale) 
we can only observe the universe from just one spacetime event “here and now” [58, 86, 26] 
by means of light travelling towards us at the speed of light. We can easily test isotropy, 
but we cannot easily determine how far away distant objects are, and so cannot so easily 
test spatial homogeneity. That is why distance determination is so important in cosmology. 
Nevertheless we are able to observationally check for spatial homogeneity, as mentioned 
above, provided we observe objects whose time evolution is well constrained, or indirectly 
use the properties of CMB propagation in an inhomogeneous universe. 

Despite these limitations, we can reasonably justify the Standard Model of cosmology 
[24, 69, 31, 81] discussed above: 


• inflation occurs with generation of fluctuations, followed by reheating, 

• a hot big bang era with baryosynthesis, nucleosynthesis, and BAO, followed by 
decoupling, 

• a matter-dominated era and structure formation, eventually succeeded by a 
dark-energy-dominated era. 


This can be regarded as well-established scientific theory. However, two limitations 
are major hurdles to extending tests of the standard model further, as discussed in 
Sections 1.6.2 and 1.6.3. 

Additionally, as implied in Section 1.2.4, we face the problem of cosmic variance: we 
have theories that predict statistics of models but we see only one spacetime, so we may 
live in a universe which is an outlier in terms of the statistics of the models. On scales 
significantly smaller than the Hubble scale we can do the same measurement in different 
patches in the sky and average over the patches. On the Hubble scale we cannot do such an 
averaging: we have just one bubble we can view at that scale. 
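
The scale dependence of this limitation can be illustrated with the standard estimate for a Gaussian sky: a multipole ℓ of the CMB angular power spectrum offers only 2ℓ + 1 independent modes, so the fractional uncertainty on its amplitude cannot be smaller than the square root of 2/(2ℓ + 1). The sketch below is an illustration only; it assumes Gaussian fluctuations and full-sky coverage, and the function name is hypothetical rather than taken from any library.

```python
import math


def cosmic_variance_fraction(ell: int) -> float:
    """Minimum fractional uncertainty on C_ell from observing a single sky.

    Assumes Gaussian fluctuations and full-sky coverage, so that each
    multipole ell provides 2*ell + 1 independent samples of its variance.
    """
    if ell < 1:
        raise ValueError("ell must be a positive integer")
    return math.sqrt(2.0 / (2 * ell + 1))


if __name__ == "__main__":
    # Low multipoles (largest angular scales) are poorly sampled;
    # high multipoles (small patches, many of them) are well sampled.
    for ell in (2, 10, 100, 1000):
        frac = cosmic_variance_fraction(ell)
        print(f"ell = {ell:5d}: delta C / C >= {frac:.3f}")
```

The irreducible uncertainty is largest at the lowest multipoles, which is the quantitative form of the statement that no averaging over patches is possible on the Hubble scale.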


1.6.2 The Physics Horizon 


The first issue is that there are major limits on testing the relevant physics at early times,
owing to limits on the energies attainable in particle colliders. The highest collision energies
currently attainable at the Large Hadron Collider (LHC) are 14 TeV = $14 \times 10^{12}$ eV. Suppose we
could go 1,000 times higher, to about $10^{16}$ eV; we still could not experimentally examine the
energy scale of inflation predicted by the simplest models ($10^{15}$ to $10^{16}$ GeV = $10^{24}$ to $10^{25}$
eV), which is close to that expected for baryosynthesis ($10^{15}$ GeV = $10^{24}$ eV), much less
determine the nature of what happened in the quantum gravity era, expected at the Planck
scale of around $10^{19}$ GeV = $10^{28}$ eV.
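
The size of the gap can be checked with a few lines of arithmetic. The following sketch simply converts the energies quoted above to eV and counts the orders of magnitude separating them from collider reach; the dictionary labels and the function name are illustrative choices, not established terminology.

```python
import math

EV_PER_GEV = 1.0e9
EV_PER_TEV = 1.0e12

# Collider reach quoted in the text, and a hypothetical machine 1,000 times stronger.
lhc_ev = 14 * EV_PER_TEV
thousandfold_ev = 1000 * lhc_ev

# Energy scales quoted in the text, converted to eV.
scales_ev = {
    "inflation (low estimate)": 1e15 * EV_PER_GEV,
    "inflation (high estimate)": 1e16 * EV_PER_GEV,
    "Planck scale": 1e19 * EV_PER_GEV,
}


def orders_beyond(target_ev: float, reach_ev: float) -> float:
    """Orders of magnitude by which target_ev exceeds reach_ev."""
    return math.log10(target_ev / reach_ev)


if __name__ == "__main__":
    for name, energy in scales_ev.items():
        print(
            f"{name:26s}: {orders_beyond(energy, lhc_ev):5.1f} orders above the LHC, "
            f"{orders_beyond(energy, thousandfold_ev):5.1f} above a 1,000x machine"
        )
```

Even the hypothetical thousandfold improvement leaves the lowest inflationary estimate nearly eight orders of magnitude out of reach.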

This limit of about $10^{16}$ eV may be dubbed the physics horizon [28]: one cannot experimentally
determine the relevant physics at higher energies, since it is not possible to build a
particle collider on Earth that could do it. Thus unlike the case of nucleosynthesis, where 
the theory can be related to data from nuclear reactors on Earth, one has no way of exper- 
imentally determining the nature of what happened in the inflationary era — unless the 
inflaton is the Higgs particle, in which case this is indeed possible [35]. Otherwise the rel- 
evant physics is out of experimental reach and hence theories about what happened then 
are destined to remain untested speculation. The same applies even more so in the case of 
the quantum gravity era. 


1.6.3 Causal and Visual Horizons


Secondly, there is a causal horizon, the particle horizon [83, 51], and an observational 
horizon, the visual horizon [31]. 


Particle Horizon 


Anything we can detect must have arrived here at the speed of light, or a lesser speed. We 
can therefore only probe as far as our past light cone allows since the start of the universe. 
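This limit, the particle horizon, is given at a time $t$ by the comoving matter coordinate
value $r = u_{ph}(t)$, where:

$$ u_{ph}(t) = \int_0^{t} \frac{dt'}{a(t')}, \qquad d_{ph}(t) = a(t)\, u_{ph}(t) \tag{1.5} $$

with $a(t)$ the scale factor and $d_{ph}(t)$ the corresponding physical distance.
This value depends on what happens all the way back to the start of the universe, and is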
much larger when inflation takes place than in a non-inflationary universe. In an
Einstein–de Sitter universe, where $a \propto t^{2/3}$ and $t_0 = 2/(3H_0)$, Eq. (1.5) gives
$d_{ph}(t_0) = 3t_0 = 2/H_0$ (in units with $c = 1$).

This implies that experimental or observational tests made here and now cannot probe
causal influences arriving from distances greater than $d_{ph}(t_0)$. Structure formation studies
that refer to scales greater than the “horizon scale” are referring to the Hubble scale at early 
times (when it was much smaller), not to the particle horizon at the present time. 


Visual Horizon 


Because we can only observe the universe by photons arriving at the speed of light, we 
can only see as far as our past light cone allows since the time of decoupling of matter and 
radiation (the universe was opaque before then), and we cannot see to earlier times. Thus 
what we can see is limited in both space and time. This visual horizon at a time $t$ [31] is
given by the comoving matter coordinate value $r = u_{vh}(t)$:

$$ u_{vh}(t) = \int_{t_{dec}}^{t} \frac{dt'}{a(t')} \tag{1.6} $$

where $t_{dec}$ is the time of decoupling of electromagnetic radiation. It is independent of what
happened at earlier times, and so in particular is unaffected by whether inflation took place 
or not. It places major limits on the data obtainable from distant cosmological domains. 
This is indicated in Figure 1.3, where light cones are at ±45° and the lower line is the
surface of last scattering. The only matter in the universe that is observable is that whose 
world lines intersect our past light cone between now and the surface of last scattering. 
In an Einstein–de Sitter universe, it is almost the same as the particle horizon, but in an
inflationary universe they are very different: the occurrence of inflation does not affect the
visual horizon but does affect the particle horizon.
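
How close “almost the same” is can be estimated directly. In an Einstein–de Sitter model the scale factor grows as the 2/3 power of time, so the comoving distance integral can be done in closed form and the ratio of visual to particle horizon depends only on the redshift of decoupling. The sketch below assumes a decoupling redshift of about 1100 (a conventional value, not one quoted in the text) and ignores the early radiation era; the function name is hypothetical.

```python
def visual_to_particle_ratio(z_dec: float) -> float:
    """Ratio of comoving visual horizon to particle horizon today,
    in an Einstein-de Sitter universe.

    With a(t) proportional to t^(2/3), the integral of dt/a from t1 to t0
    is proportional to t0^(1/3) - t1^(1/3). Since 1 + z = (t0/t)^(2/3),
    (t_dec/t0)^(1/3) = (1 + z_dec)^(-1/2), which gives the expression below.
    """
    if z_dec < 0:
        raise ValueError("redshift must be non-negative")
    return 1.0 - (1.0 + z_dec) ** -0.5


if __name__ == "__main__":
    z_dec = 1100.0  # assumed decoupling redshift, for illustration only
    ratio = visual_to_particle_ratio(z_dec)
    print(f"visual / particle horizon = {ratio:.4f}")
```

Under these assumptions the visual horizon lies within about three per cent of the particle horizon. With an inflationary epoch the particle horizon grows enormously while this visual horizon is unchanged, so the ratio becomes very small.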

This limit applies to all forms of electromagnetic radiation. There are similar horizons 
for neutrino and gravitational wave observations, where $t_{dec}$ in Eq. (1.6) is replaced by the
times of neutrino and graviton decoupling, respectively. No developments in technology 
will alter these limits. 

Figure 1.3 Observable limits indicated in a Penrose diagram, where comoving coordinates are used
and light rays are at ±45°. The line at the bottom is the surface of last scattering; visual horizons are
unaffected by what happens earlier.


The implication is that anything we say about exterior regions, or matter in them, is 
unverifiable. The universe out there may be spatially homogeneous, or it may not. The mat- 
ter density may stay constant or rise or fall or oscillate. We could possibly be surrounded 
by a singular sphere at finite radius, or by an asymptotically flat empty domain. We can 
claim what we like, and no observation will ever be able to either confirm or disprove what 
we say [28]. 


1.6.4 Claimed Multiverses 


There are at present many claims for physical existence of various types of multiverse: vast 
numbers of universe domains, something like ours, but perhaps with varying parameters 
or even varying physics [82, 46, 107, 93, 45]. The assumption is that we can extrapolate 
from the observable domain to 100 Hubble radii, $10^{1000}$ Hubble radii, or much much more,
and predict at least statistically what exists out there. Indeed the claim is often made that 
there is an infinite number of such other universe domains (“pocket universes”) out there, 
usually assumed to have different physics in each [13, 45, 95]. 

Now no one disputes that for a reasonable distance beyond the visual horizon, things are 
likely to look like what we see: in that sense other universe domains exist beyond what is 
visible. But the further out we make claims about what exists, the less justifiable they are 
(see Figure 1.3). No observational data whatever are available for claims about what happens
$10^{10000000}$ Hubble radii away, as implied by these theories. To be sure, most of these
theories only predict statistics of what is there, not what actually exists: but there is no way 
to test these statistical predictions. Thus multiverse claims are not directly observationally 
testable. 


Implied by Known Physics that Leads to Chaotic Inflation 


One way of justifying the multiverse claim is to propose it is the outcome of physics that 
can be regarded as well founded, and makes our observed universe domain probable. The 
two dominant ways of justifying this are the chaotic inflationary proposal, where quantum
fluctuations in conjunction with a $\phi^2$ potential lead to continual bubble formation [62], and
the proposal of Coleman–De Luccia tunnelling between quantum vacua [2].

Now not all inflation is chaotic [63], so evidence for inflation is not necessarily evidence
for chaotic inflation. The proposal of a $\phi^2$ potential can be tested by its effect on CBR
anisotropies, and the Planck data are against it [65]. The key physics for the tunnelling
proposal (Coleman–De Luccia tunnelling) in false vacuum-driven inflation is extrapolated
from known and tested physics to new contexts; the extrapolation is unverified and indeed
is unverifiable; it may or may not be true. The physics is hypothetical rather than tested,
contrary to what one might assume from some presentations. The situation is not


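$$ \{\text{Known Physics}\} \Rightarrow \text{Multiverse} \tag{1.7} $$

but rather

$$ \{\text{Known Physics}\} \Rightarrow\ ?? \Rightarrow \{\text{Hypothetical Physics}\} \Rightarrow \text{Multiverse} \tag{1.8} $$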


It is a great extrapolation from known physics. This extrapolation is untestable: it may or 
may not be correct. 

If we propose some mechanism for creating a multiverse, we can test it to a limited 
degree by observing what is in our particular domain, and claim that the multiverse makes 
probable what we see. But we cannot prove that what we see is indeed probable, because 
there is no unique measure that applies to such universes [44, 47]. Proposed measures 
suffer from divergence problems because of the infinities of universes they are supposed to 
apply to [63, 100]. Further, you cannot test a measure, precisely because we see only one 
universe domain. In the end, there is no proof the universe we see is probable. That is a 
philosophical assumption that may or may not be true. It is not a necessity that the universe 
is probable. 


The Anthropic Multiverse 


The problem of the fine tuning of the laws of physics that allows life to exist (the Anthropic 
question [15, 13, 82, 92]) has often been cited in favour of chaotic inflation with different 
physics or constants of physics in each bubble. 

In particular, this is often argued as regards the problem of vacuum energy [106, 82]. The
Quantum Field Theory vacuum energy gives a huge value, discrepant with the cosmologically
observed value for $\Lambda$ by 120 orders of magnitude if this vacuum energy gravitates, as
is the obvious assumption. This is solved if $\Lambda$ takes different values in each bubble universe
in a multiverse, as might be suggested by string theory for example, for then life will only
exist in bubbles with small values of $\Lambda$ because these will be the only ones with galaxies
and stars. Hence the small value of $\Lambda$ is claimed to be the result of an anthropic selection
effect in a physically existing multiverse. Weinberg [105] used this argument to predict a
small positive value for $\Lambda$, which is an impressive achievement.
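
The order of magnitude of this discrepancy can be reproduced with a rough estimate. The sketch below compares the Planck density, taken here as a stand-in for the quantum field theory vacuum energy with a Planck-scale cutoff, with the observed dark energy density. The cutoff choice, the Hubble constant of 70 km/s/Mpc, and the dark energy fraction of 0.7 are all assumptions made for illustration, and the function names are hypothetical.

```python
import math

# Physical constants in SI units (rounded values).
C = 2.998e8          # speed of light, m / s
HBAR = 1.0546e-34    # reduced Planck constant, J s
G = 6.674e-11        # Newton's constant, m^3 kg^-1 s^-2
MPC_IN_M = 3.0857e22 # one megaparsec in metres


def planck_density() -> float:
    """Planck mass density c^5 / (hbar G^2), in kg / m^3."""
    return C**5 / (HBAR * G**2)


def dark_energy_density(h0_km_s_mpc: float = 70.0, omega_lambda: float = 0.7) -> float:
    """Observed dark energy mass density, in kg / m^3.

    Computed as omega_lambda times the critical density 3 H0^2 / (8 pi G).
    Both default parameter values are assumed, illustrative numbers.
    """
    h0 = h0_km_s_mpc * 1.0e3 / MPC_IN_M  # convert to s^-1
    rho_crit = 3.0 * h0**2 / (8.0 * math.pi * G)
    return omega_lambda * rho_crit


def discrepancy_orders() -> float:
    """Orders of magnitude between the two densities."""
    return math.log10(planck_density() / dark_energy_density())


if __name__ == "__main__":
    print(f"discrepancy: about {discrepancy_orders():.0f} orders of magnitude")
```

With these inputs the mismatch comes out at roughly 120 orders of magnitude; the precise figure depends on the cutoff assumed for the vacuum energy.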

This argument however assumes not just that different bubbles exist but also physics
that allows different values of $\Lambda$, perhaps string theory with many vacua, together with a
mechanism that ensures different vacua are realised in different bubbles. The first is
problematic [7]; it is particularly this last step that is an untestable assumption about physics.
Furthermore, the argument from $\Lambda$ to the multiverse depends on there being no other way
of solving the vacuum energy problem; otherwise it is one of several such proposals and
one has to choose between them using philosophical principles, which cannot themselves
be tested by a scientific process.

A way round the vacuum energy dilemma used to motivate the multiverse is as follows: 
the vacuum does not gravitate if we use trace-free Einstein equations plus separate con- 
servation equations (an approach closely related to “unimodular gravity”). This works as 
follows [36]: the Einstein equations 


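$$ R_{ab} - \tfrac{1}{2} R g_{ab} + \Lambda g_{ab} = \kappa T_{ab} \tag{1.9} $$

(ten equations) imply energy conservation:

$$ T^{ab}{}_{;b} = 0 \tag{1.10} $$

It is plausible to assume $\Lambda$ would include a vacuum energy term. If instead of Eq. (1.9),
we take its trace-free part, we get

$$ R_{ab} - \tfrac{1}{4} R g_{ab} = \kappa \left( T_{ab} - \tfrac{1}{4} T g_{ab} \right) \tag{1.11} $$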


(nine equations). Assume Eq. (1.11) is the real gravitational equation, rather than Eq. (1.9);
then the trace $T$ of $T_{ab}$ does not gravitate and neither does $\Lambda$, so the vacuum energy has
no handle on spacetime curvature. Now assume Eq. (1.10) holds separately, and using
this, integrate the divergence of Eq. (1.11). This formally recovers Eq. (1.9), but now $\Lambda$
is a constant of integration that has nothing to do with vacuum energy [106, 36]. Inflation
works fine in this modified theory [29] (basically because the Klein–Gordon equation for $\phi$
involves only the gradient of $V(\phi)$, not its absolute value), and variational principles can be
found for it [6]. This solves the major discrepancy between the quantum vacuum energy
and the observed value of $\Lambda$ that is otherwise a great problem for gravitational physics,
without invoking a multiverse.
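
The integration step can be written out explicitly. The following is a sketch of the standard argument under the assumptions already stated: the trace-free equations, the contracted Bianchi identity, and separate conservation of the stress tensor. The semicolon notation for covariant derivatives is assumed to match Eq. (1.10).

```latex
\begin{align*}
% Divergence of the trace-free equations (1.11), using the contracted
% Bianchi identity R_{ab}{}^{;b} = \tfrac{1}{2} R_{;a}:
\tfrac{1}{2} R_{;a} - \tfrac{1}{4} R_{;a}
  &= \kappa \left( T_{ab}{}^{;b} - \tfrac{1}{4} T_{;a} \right) \\
% Impose separate conservation, Eq. (1.10), so that T_{ab}{}^{;b} = 0:
\tfrac{1}{4} \left( R + \kappa T \right)_{;a} = 0
  \quad &\Longrightarrow \quad R + \kappa T = 4\Lambda = \text{constant} \\
% Substitute \kappa T = 4\Lambda - R back into Eq. (1.11):
R_{ab} - \tfrac{1}{2} R g_{ab} + \Lambda g_{ab} &= \kappa T_{ab}
\end{align*}
```

The constant $4\Lambda$ enters only when the gradient equation is integrated, which is why it is unrelated to any vacuum energy contribution to $T_{ab}$. Taking the trace of the final line returns $R + \kappa T = 4\Lambda$, so the result is self-consistent.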


Bubble Collisions 


In multiverse models based on tunnelling, there is a competition between the rate of nucleation
of bubbles and the rate of expansion of the inflating region. If they are delicately
balanced, there will be some bubble collisions but the bubbles will largely remain separated,
and inflation will continue forever. If the expansion rate is rather higher, there will be
no bubble collisions. If it is rather lower, all the bubbles will merge into one and inflation
will come to an end.

The one way of confirming the physical model that underlies chaotic inflation predictions
would be to observe the effects of such bubble collisions [43, 3]. It is proposed
that such collisions would lead to anomalous circles in CBR anisotropy observations. This
would be fairly convincing evidence for the anthropic multiverse proposal if some physical
constant such as the fine structure constant or particularly the cosmological constant were
different within such circles [70]. A further interesting proposal is to look for collisions via the
kinetic Sunyaev–Zel'dovich effect [109]. So looking for such traces is certainly worthwhile.
However, only some multiverse models make this prediction, so not seeing them will not
exclude multiverse models in general, just that subclass of models. If any of these effects were
found, that would be positive, but one would have to rigorously exclude other effects that could
lead to them, which may be difficult.


Cyclic/Bouncing Universe 


A variety of cyclic universes have been proposed that are in effect eternally existing mul- 
tiverses, with different universe bubbles being created again and again in time rather than 
occurring at the same time in space. This potentially may lead to the issue of eternal return. 

The problem here is that inflation effectively wipes out traces of previous eras [47], so 
again there is no easy proof of such claims. It has been claimed that they too might lead 
to circular patterns in the CMB sky, in the case of Penrose’s conformal cyclic cosmology
[77], or specific patterns of perturbations resulting from previous collapsing phases, in a 
variety of bouncing models. However, these assume either some kind of generalization of 
general relativity, where a regular bounce can take place despite the probable growth of 
inhomogeneity and anisotropy during the collapse, or some form of hypothetical higher- 
dimensional “brane” structure. It is not clear that this physics can be tested. 


Implication of all the Above 


The multiverse proposal is not provable either by observation, or as an implication of well- 
established physics. It is under-determined by the data; in particular theories proposing 
how different physical parameters will necessarily occur in different bubbles, should those 
occur, are untested and indeed untestable. It may be true, but cannot be shown to be true 
by observation or experiment, although a subset of multiverse theories (those with bubble 
collisions) could possibly receive some limited observational support, by showing that a 
few bubble collisions have taken place. 

Thus it is not a part of testable scientific theory in the usual sense. However, it does 
have great explanatory power: it does provide an empirically based rationalization for fine 
tuning, developing from known physical principles. But even if we develop a multiverse 
theory which can explain in an anthropic way why these parameters have the values they 
have, this theoretical argument does not amount to a unique prediction, because multiverses 
allow anything to occur. 

Here one must distinguish between explanation and prediction. Successful scientific the- 
ories make specific predictions which can be tested. The multiverse theory cannot make 
any unique predictions because, unless one adds extra untestable philosophical assump- 
tions as an add-on, it can explain anything at all. Any theory that is so flexible is not 
testable because almost any observation can be accommodated. It is plausibly scientifically 
informed philosophy rather than science in the usual sense. 

Note: we are concerned here with real physically existing multiverses, not with potential
or hypothetical multiverses. Those exist as theories that we can consider and develop as we 
wish. It is the claim that they are physically realised that is the issue at stake. 


Multiverses and Cosmologia 


Because the multiverse is most often proposed in relation to the anthropic issue, it often 
is implicitly [82] or explicitly [92, 52, 57] phrased in that context; hence it is related to 
cosmologia rather than just cosmology. It is then stated to solve the philosophical problems 
of existence in a purely scientific way. 

However, the multiverse does not solve any of the fundamental problems of existence — 
they just recur in this new context: 


• Why the multiverse?
• Why its laws? Why the specific constants associated with those laws?
• Why is it a suitable home for life?


To suggest it solves them is naive — it just postpones them. It is not a solution to the 
problems of cosmologia. 


1.6.5 Infinities 


A particular problem with multiverse theories is the often claimed existence of physically 
existing infinities of galaxies in them [47, 63]. This raises a more general issue: will phys- 
ically existing infinities occur in the theory? If so will they play an essential or inessential 
role? 

The key factor here is that infinity is not a very large number: it is an amount greater than
any number that can exist, an unattainable state rather than an attainable one. It is needed
in mathematics, but cannot occur in physical reality.
David Hilbert stated its relation to physics very clearly ([53]: 151): 


The infinite is nowhere to be found in reality, no matter what experiences, observations, and knowl- 
edge are appealed to. It neither exists in nature nor provides a legitimate basis for rational thought . . . 
The role that remains for the infinite to play is solely that of an idea ... which transcends all 
experience and which completes the concrete as a totality ...


The interesting cosmological claim in this regard is the idea that a spatial infinity nec- 
essarily results instantaneously from the tunnelling processes that are supposed to lead to 
new universe bubbles [40]. However, this does not work in a finite time if we remember 
that the origin of such a bubble cannot be exactly a spacetime point [34]. The potential 
spatial infinity arising from the tunnelling takes an eternity to attain completion, no matter 
how small the bubble nucleus is, and so is never attained in any finite time. That is the true 
nature of infinities in cosmology. They are always out of reach and remain potential rather 
than actualised. One can therefore propose as a basic principle: 


Finiteness: Infinities, which are essentially unattainable by their very definition, should not occur as
essential elements in any proposed physical theory, including cosmology.


This contrasts with their frequent invocation by some cosmological theorists. It is a huge 
act of hubris to extrapolate from the observable universe domain to a supposed physi- 
cally existing infinity, never encountering any limit (remember the conformal diagram in 
Figure 1.3). Hilbert gives a more supportable position. 

In any event, claims of physically existing infinities of universes are not scientific state- 
ments — if science involves testability by either observation or experiment. No matter how 
long you count entities, you cannot prove there is an infinite number of them — if indeed 
you can see them, which you cannot in this case (horizons!).


1.6.6 Creation 


Various theories claim to explain the creation or start of the universe on a scientific basis, 
essentially in terms of some or other of the known laws of physics [98, 49, 57]. But as 
mentioned in Section 1.2.5, this is a very problematic claim. 

Classical general relativity theory predicts that if the usual energy conditions are sat- 
isfied, there should be a start to the universe [51]. Now this is a very dramatic event: if 
it occurs, it is a spacetime singularity, representing a start to matter and fields, to space 
and time, to physics itself. Can this be considered in a causal way? The whole concept of 
causality is called into question in this context, because it depends on the existence of the 
categories of space and time — which did not exist before the universe came into being. If 
so how, and in terms of what pre-existing entities or powers did this happen?²

The attempts made to bring this within the powers of physics, however, assume some or 
other of these elements — specifically, the laws of physics, involving inter alia the entire 
structure of quantum field theory, Hilbert spaces, variational principles, symmetry princi- 
ples, and so on [80] — preceded the existence of the universe and so were able to bring it 
into being. In what way can these exist before the universe exists? What alternative ways 
are there of considering this? 

These issues were mentioned in Section 1.2.5. But the further issue here is, suppose we 
have a theory for creation of the universe (out of nothing, or out of something), can we 
scientifically test such proposals? How do we test what occurred before physical events 
existed? How do we test if laws based in spacetime features somehow had an existence 
before space or time had come into being? 

It is clear we can do no laboratory or collider experiments that are relevant to such theo- 
ries. If we have such a theory it may make some claims about what cosmological features 
will be visible today. But there will be no way to tie any such prediction uniquely to such a 
theory, as there is already a number of alternatives, usually with adjustable parameters that 
can be used to fit any predicted cosmological effects to the data. 


2 I am very aware that the term “happen” hardly makes sense in this context; but we have no other way to think
about it than to use this or a similar term. 


32 George F. R. Ellis 


1.6.7 Limits of Testing 


The burden of this section is to recognise the limits of testing theories in the cosmological 
context. 


Issue 4: Limits on testing and observations. There are strict limits on observations in cosmology
both in terms of space and time. There are also limits on testing the physics relevant to cosmological
theories. These limits strongly constrain the degree to which we can observationally determine the
nature of cosmological models on the largest scales and at the earliest times.


Despite these limits, many strong claims are being made about what can be determined 
in cosmology. By great ingenuity we have in fact a well-tested model of what exists in 
the observable universe domain (Figure 1.3). Outside that domain, our models are not 
constrained by observations in the same way. 


1.7 Theory, Data, and the Limits of Science 


The very nature of the scientific enterprise is at stake in the string theory and multiverse 
debates: their proponents are proposing weakening the nature of scientific proof in order to 
claim that multiverses together with string theory provide a scientific explanation for what 
we see. This is a dangerous tactic. Leonard Susskind explicitly states the criteria for sci- 
entific theories should be weakened [56], as do Richard Dawid [22] and Sean Carroll [14]. 
The cosmological ideas and physical theories put forward are seen as being so compelling 
that we can loosen the need for empirical testing. 

On the face of it, this proposal for “post empirical science” (Dawid) is highly
questionable [33]. This is specifically where careful philosophical attention is needed
to throw light on the argument put forward.


1.7.1 Criteria for a Scientific Theory


The usual criteria for a theory being scientific are [27, 28]:


1. Satisfactory structure: (a) internal consistency, (b) simplicity (Ockham’s razor), (c) 
“beauty” or “elegance”, (d) usually, a mathematical formulation. 

2. Intrinsic explanatory power: (a) logical tightness, (b) scope of the theory — unifying 
otherwise separate phenomena. 

3. Extrinsic explanatory power: (a) connectedness to the rest of science, (b) extendability — 
a basis for further development. 

4. Observational and experimental support: (a) the ability to make quantitative predictions 
that can be tested; (b) confirmation: the extent to which the theory is supported by such 
tests. 


The problem is that in the context of cosmology, these will conflict with each other. You 
have to choose between them. It is particularly the last that characterizes a theory as 
scientific, and it is the one that is being brought into question. 

In order to support multiverses and string theory as scientific theories, of course the first
prize would be to show that these theories are indeed after all testable, in some way not 
recognized in the discussion above. Failing that, one has to consider in what way one can 
justify them as scientific when such tests are not available. 


1.7.2 Arguments 


Various arguments have been given by Dawid and Carroll, related to the underdetermination
of the theories by the possible observations. There is not space to give a proper response
here: there is of course a large literature on the nature of scientific theories and their
development and philosophical status. The purpose of this chapter however is to point out that
there is serious philosophical work to be done in developing a rigorous framework for 
selecting what should be included in the scientific fold, and what is rather regarded as 
philosophy. I will just make a few comments. 

The main arguments for weakening the requirement for experimental testing seem to be 
the following: 


• The theory proposed is the only game in town. In the case of cosmology, it is the only
  explanation of the very small, observationally determined value of the cosmological
  constant, in contrast to quantum field theory predictions. In the case of string theory, it is the
  only well-developed proposal for unification of all four fundamental forces.
  But then firstly, it must indeed be the only explanation. As indicated above, trace-free
  gravity is a viable way to solve the vacuum energy problem: a multiverse is not the only
  possibility. Secondly, there may be no such game at all. In the case of string theory, it
  is assumed one can unify the strong, electroweak, and gravitational forces in a unified
  framework. But from a general relativity viewpoint, gravity is fundamentally not a force
  at all: it is an effect of spacetime curvature. It is therefore perfectly possible there is no
  such unification.

• If the theory provides unexpected unifications of different areas of physics, it must be
  true.
  But this is a statement about mathematical models; it does not always imply physical
  realization of these mathematical theories. That has to be decided by experimental test.
  Thus for example anti-de Sitter/conformal field theory (CFT) dualities are proposed to
  give results about solid state physics, an astounding unification of this kind. But multiverses
  do not provide any specific physics unification.

• Extension of mathematical models in this way to develop elegant new theories has
  worked well in the past, so it will work again in the future.
  This ignores the cases where it did not work, for example the SU(5) Grand Unified
  Theory that was thought to be right because it was so simple and beautiful a unification
  of the electroweak and strong forces, but was eventually shown to be wrong by proton
  decay experiments.

As regards theories about the coming into existence of the universe itself, and what makes
specific laws fly, there are many games in town. The one thing that is quite clear is that 
they cannot all be true, because they contradict each other. 


1.7.3 What Can We Know?


This chapter has explored the limits to what can be tested in cosmology, and set the 
framework for considering its relation to the nature of science. 


Main Thesis 1: The standard model of cosmology is well tested within the observational hori- 
zons. However, it has problems to do with dark matter, dark energy, and the transition of quantum 
fluctuations to classical fluctuations. Additionally the nature of the inflaton is unknown. 


Main Thesis 2: Because of the limits on testing and observation, unless we live in a small universe 
there will always be uncertainty about what exists on the largest scales. In particular this applies 
to the supposed nature of any multiverse that is claimed to exist. If theories insist on claiming the 
physical existence of infinities of any kinds of entities, they are making claims that are well beyond 
the bounds of testable science. 


Main Thesis 3: Statements about the nature of/causes of the origin of the universe will of necessity 
always be speculative rather than proved science. Attempts to claim one has solved the issue of the 
origin of the universe and of Cosmologia on the basis of scientifically testable theories are unfounded. 
They are attempts to pass off philosophical predilections as established science. They mislead the 
public about what science can say about important issues. 


Philosophers of science should team up with scientists to clarify the boundaries of science 
in this case where testability is pushed beyond its limits. 


Black Holes, Cosmology and the Passage of Time:
Three Problems at the Limits of Science 


BERNARD CARR 


2.1 Introduction 


The boundary between science and philosophy is often blurred at the frontiers of knowl- 
edge. This is because one is dealing with proposals which are not amenable to the usual 
type of scientific tests, at least initially. Some scientists have an antipathy to philosophy and 
therefore regard such proposals disparagingly. However, that may be short-sighted because 
historically science had its origin in natural philosophy and the science/philosophy bound- 
ary has continuously shifted as fresh data accumulate. The criteria for science itself have 
also changed. So ideas on the science/philosophy boundary may eventually become proper 
science. Sometimes the progress of science may even be powered from this boundary, with 
new paradigms emerging from there. 

A particularly interesting example of this in the context of the physical sciences is 
cosmology. This is because the history of physics involves the extension of knowledge 
outwards to progressively larger scales and inwards to progressively smaller ones, and the 
scientific status of ideas at the smallest and largest scales has always been controversial. 
Cosmology involves both extremes and so is doubly vulnerable to anti-philosophical crit- 
icisms. While cosmography concerns the structure of the Universe on the largest scales, 
these being dominated by gravity, cosmogeny studies the origin of the Universe and 
involves arbitrarily small scales, where the other forces of nature prevail. Indeed, there 
is a sense in which the largest and smallest scales merge at the Big Bang. So cosmology 
has often had to struggle to maintain its scientific respectability and more conservative 
physicists still regard some cosmological speculations as going beyond proper science. 
One example concerns the current debate over the multiverse. The issue is not just whether 
other Universes exist but whether such speculations can be classified as science even if 
they do, since they may never be seen. 

While most of this chapter focuses on cosmology, two other problems straddling the 
boundary between physics and philosophy are also discussed. The first concerns black 
holes. Although these objects were predicted by general relativity a century ago, Albert 
Einstein thought they were just mathematical artefacts and it was 50 years before obser- 
vational evidence emerged for their physical reality. The first type to be established were 
stellar black holes but subsequently evidence has emerged for increasingly massive ones, 
with ‘intermediate mass black holes’ being associated with gamma-ray bursts and
‘supermassive black holes’ powering quasars and perhaps even forming seeds for galaxies.
Theorists also speculate about ‘primordial black holes’ which formed in the early stages
of the Big Bang and could be much smaller than a solar mass. As with other Universes, we
cannot be certain that primordial black holes existed, not because they are too far away
but because they formed too early. Nevertheless, thinking about them has led to important 
physical insights and placed interesting constraints on the early Universe. 

Studies of cosmology and black holes have much in common. Historically, both have 
involved considering ever larger and smaller scales, and both become very speculative at 
the largest and smallest scales. Although Einstein himself was slow to appreciate their 
significance, they both derive from general relativity and yet may play a role in some final 
theory which goes beyond this. 

The final problem is even more speculative and concerns the flow of time. Unlike the 
other two problems, this one clearly goes beyond relativity theory and many would argue 
that it goes beyond physics altogether. However, I take a contrary view and propose an 
interpretation which involves higher dimensions and a very speculative link between cos- 
mology and black holes. So this third problem brings the other two together, although I 
should caution that my proposal trespasses even further into the domain of philosophy. 

The plan of this chapter is as follows. Section 2.2 expands on the notion of the out- 
ward and inward journeys of physics. Section 2.3 discusses the multiverse and M-theory 
as examples of ideas at the limits of science. Section 2.4 introduces the concept of ‘meta- 
cosmology’ to describe ideas on the border of cosmology and philosophy. Section 2.5 
reviews the evolving relationship between cosmology and metacosmology from a histori- 
cal perspective. Section 2.6 focuses on black holes and shows that similar considerations 
apply in this context. Section 2.7 proposes a very tentative model for the passage of time. 
Section 2.8 draws some general conclusions about the nature of science and the possible 
relevance of mind. 


2.2 The Macro-Micro Connection and the Triumph of Physics 


The outward journey into the macroscopic domain and the inward journey into the micro- 
scopic domain — mentioned in the Introduction — have revealed ever larger and smaller 
levels of structure in the Universe: planets, stars, galaxies, clusters of galaxies and the 
entire observable Universe in the macroscopic domain; cells, DNA, atoms, nuclei, sub- 
atomic particles and the Planck scale in the microscopic domain. These scales of structure 
are summarised in the image of the Cosmic Uroborus (the snake eating its own tail) shown 
in Figure 2.1. The numbers at the edge indicate the scale of these structures in centime- 
tres. As one moves anticlockwise from the tail to the head, the scale increases through 60 
decades: from the Planck scale ($10^{-33}$ cm), the smallest meaningful distance allowed by
quantum gravity, to the scale of the observable Universe ($10^{27}$ cm). So one can regard the
Cosmic Uroborus as a clock, in which each minute corresponds to a decade in scale. 

Figure 2.1 The Cosmic Uroborus summarises the different types of structure in the Universe. Also
shown are the cross-links associated with various forces. The macro and micro scales merge at the 
Big Bang, where new physics may arise. 
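
The clock analogy can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the two end-point scales quoted in the text and a round illustrative value of 1e-8 cm for the size of an atom; the function name is hypothetical and the convention of counting minutes from the Planck scale is chosen only for illustration.

```python
import math

# Length scales quoted in the text, in centimetres.
PLANCK_SCALE_CM = 1e-33
OBSERVABLE_UNIVERSE_CM = 1e27


def decades_above_planck(scale_cm):
    """Number of factors of ten separating a scale from the Planck scale."""
    return math.log10(scale_cm) - math.log10(PLANCK_SCALE_CM)


# Total span of the Uroborus: one 'minute' of the clock per decade.
total_decades = decades_above_planck(OBSERVABLE_UNIVERSE_CM)

# Illustrative entry: an atom is roughly 1e-8 cm across (assumed round value).
atom_minute = decades_above_planck(1e-8)

print(f"total span: {total_decades:.0f} decades")
print(f"an atom sits about {atom_minute:.0f} minutes from the Planck scale")
```

On this counting the full circuit of the snake takes one hour, with atoms a little under halfway round.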


The inward and outward journeys have also led to the discovery of the forces which 
determine the forms of these structures and the associated laws of nature. These forces are 
indicated by the horizontal lines in Figure 2.1 and link the macroscopic and microscopic 
domains. For example, the ‘electric’ line connects an atom to a planet because the force 
which binds the electron to a nucleus in an atom and the intermolecular force which binds 
solid objects are both electrical in origin. The ‘strong’ and ‘weak’ lines connect a nucleus 
to a star because the strong force which holds nuclei together also provides the energy 
released in the nuclear reactions which power a star and the weak force which causes nuclei 
to decay also prevents stars from burning out too soon. The ‘SUSY’ (supersymmetry) line 
connects ‘weakly interacting massive particles’ to galaxies because these may provide the 
‘cold dark matter’ (CDM) halos revealed by galactic rotation curves. The ‘GUT’ (grand
unified theory) line connects with large-scale structure because the density fluctuations 
which led to this originated when the temperature of the Universe was high enough for 
GUT interactions to be important. 

This demonstrates that the macroscopic and microscopic domains are intimately linked, 
so that the outward and inward journeys are not disconnected but constantly throw light on 
each other. Indeed physics has revealed a unity about the Universe which makes it clear that 
everything is connected in a way which would have seemed inconceivable a few decades 
ago. The Big Bang might be regarded as the ultimate micro-macro link since it implies 
that the entire Universe was once compressed to a tiny region of huge density. This is why
the head of the snake meets the tail. Since light travels at a finite speed, we can never
see further than the distance light has travelled since the Big Bang; this is about 40 billion
light-years, three times the age of the Universe times the speed of light because the cosmic 
expansion helps light travel further. Near this horizon, more powerful telescopes probe to 
earlier times rather than larger distances, so early Universe studies have led to an exciting 
collaboration between particle physicists and cosmologists. We now have a fairly complete 
picture of the history of the Universe after the first microsecond (Carr, 2013a). 
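
The factor of three in the horizon distance can be recovered from a short calculation. As a sketch, assume a spatially flat, matter-dominated model in which the scale factor grows as $a \propto t^{2/3}$ (an idealisation that is not stated in the text), and take $t_0 \approx 13.8$ billion years for the age of the Universe:

```latex
d_{\rm hor} = a(t_0) \int_0^{t_0} \frac{c\,dt}{a(t)}
            = c\, t_0^{2/3} \int_0^{t_0} t^{-2/3}\,dt
            = 3 c t_0
            \approx 3 \times 13.8 \ \text{billion light-years}
            \approx 41 \ \text{billion light-years}.
```

In a static Universe the same integral would give only $c t_0$; the extra factor arises because space has expanded while the light was in transit.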

The inward and outward journeys have also led to new conceptual ideas and changes 
in our worldview. The outward one has led to the shifts from geocentric to heliocentric 
to galactocentric to cosmocentric worldviews and to the radical change of view of space 
and time entailed in relativity theory. The inward one has led to atomic theory, quantum 
theory and a progressively unified view of the forces of nature. Indeed, these developments 
have shattered our perspective of the microworld just as much as relativity theory has 
shattered our perspective of the macroworld. We do not fully understand what happens as 
one approaches the top of the Uroborus — one encounters the multiverse on the macroscopic 
side and M-theory on the microscopic side — but the possibility of incorporating gravity into 
the unification of forces has led some physicists to proclaim that we are on the verge of 
obtaining a ‘Theory of Everything’ (TOE). 

An intriguing feature of the top of the Uroborus is the possibility of extra dimen- 
sions. Unifying all the subatomic interactions requires extra wrapped-up dimensions of 
the kind proposed by Theodor Kaluza (1921) and Oskar Klein (1926) to explain electro- 
magnetism. For example, ‘superstring’ theory suggests there could be six and the way 
they are compactified is described by a Calabi-Yau manifold. There were originally five
different superstring theories but it was later realised that these are all parts of a sin- 
gle more embracing model called ‘M-theory’, which has seven extra dimensions (Witten, 
1995). In one variant of this, proposed by Lisa Randall and Raman Sundrum (1999), the 
eleventh dimension is extended, so that the physical world is viewed as a four-dimensional 
(4D) ‘brane’ in a higher-dimensional ‘bulk’. The development of these ideas is sum- 
marised in Figure 2.2. We do not experience these extra dimensions directly because their 
effects only become important on the smallest and largest scales (i.e. at the top of the 
Uroborus). 


2.3 The Multiverse and M-Theory 


In contemplating the top of the Cosmic Uroborus, where we encounter the multiverse and 
M-theory, it is clear that we are stretching physics to its limits. This is because we cannot 
see further than the distance light has travelled since the Big Bang ($10^{26}$ m) or smaller than
the distance ($10^{-19}$ m) probed by the Large Hadron Collider (LHC), both these regimes
being relevant to cosmology. However, whatever the practical problems of acquiring data 
in these regimes, it would be perverse to claim that there is no interesting physics there. In 
this section, I will briefly discuss the multiverse and M-theory; see Carr (2007) for a more
complete treatment. 

In the standard Big Bang theory, there should be unobservable expanding domains 
beyond the horizon which are still part of our Big Bang. This is what Max Tegmark (2003) 
calls the ‘Level I’ multiverse and it is relatively uncontroversial. If taken to extremes, it 
leads to some bizarre possibilities (like our having identical clones at some stupendous 
distance) and George Ellis highlights the hubris involved in such an extrapolation (Carr 
and Ellis, 2008). However, it would be hard to deny its existence at some level. 

Recent developments in cosmology and particle physics have led to the more radical 
proposal that there could also be other Big Bangs which are distinct from ours. These 
might be regarded as the inevitable outcome of the physical processes that generated our 
own Universe and form what Tegmark classifies as the ‘Level II’ multiverse. Some of 
the proposals come from cosmologists and others from particle physicists, so the Level 
II multiverse might be regarded as the culmination of scientific attempts to understand 
the largest and smallest scales in Figure 2.1. In a way, Ellis’s hubris argument supports 
the Level II multiverse, because it requires even more hubris to assume that the Level I 
multiverse extends forever. Indeed, the density fluctuations seen in the cosmic microwave 
background (CMB) would be of order unity if extrapolated to a scale of $10^{100}$ times the
current horizon, so this might be taken to indicate the scale of Level II structure. Although 
this is large, it is gratifying that it is much less than the scale at which Tegmark’s clones 
appear! 

Let us first examine the cosmological proposals. Some invoke ‘oscillatory’ models in 
which a single Universe undergoes cycles of expansion and recollapse (Tolman, 1934). 
In this case, the different Universes are strung out in time. Others invoke the inflationary 
scenario, in which our observable domain is a tiny part of a bubble, which underwent an 
accelerated expansion phase at some early time as a result of false vacuum effects (Guth, 
1981). In this case, there could be many other bubbles, corresponding to other Universes 
spread out in space. A variant of this idea is ‘eternal’ inflation, in which the Universe is
continually self-reproducing, so that there are an infinite number of bubbles extending in
both space and time (Vilenkin, 1983; Linde, 1986).

Figure 2.2 The sequence of extra dimensions invoked in modern physics (right), one of which is
extended in brane cosmology (left).

Turning to multiverse proposals inspired by particle physics, we have seen that in one 
version of M-theory our Universe is a 4D brane in a five-dimensional (5D) bulk. In this
case, there might be many other branes, corresponding to a multiverse extending in the
fifth dimension, and collisions between neighbouring ones might even generate Big Bangs 
(Steinhardt and Turok, 2006). In the standard version of M-theory, with no extended 
dimension, the vacuum state is determined by the Calabi-Yau compactification. Recent 
developments suggest that the number of these compactifications could be enormous (e.g. 
$10^{500}$), each one corresponding to a Universe with a different value of the cosmological
constant (Λ) and a different set of physical constants (Bousso and Polchinski, 2000). So in
this ‘string landscape’ scenario, the constants would be contingent on which Universe we 
happen to occupy (Susskind, 2005). 

One problem with the last scenario is that the observed cosmological constant seems 
implausibly small. In principle, its value could lie anywhere between plus and minus the
Planck density, which is 120 orders of magnitude larger than the observed value. There is 
also the puzzling feature that the observed vacuum density is currently very similar to the 
mean matter density, a coincidence which would only apply at a particular cosmological 
epoch. One cannot predict the distribution of Λ across the different Universes precisely in
this picture but it would be very unlikely to have a peak in the observed range. However,
as first pointed out by Steven Weinberg (1987) and later discussed by George Efstathiou
(1995) and Alex Vilenkin (1995), the value of Λ is constrained because galaxies could not
form if it were much larger than observed. Since the growth of density perturbations is
quenched once Λ dominates the density, if bound systems have not formed by then, they
never will. Thus, if one were to contemplate an ensemble of Universes with a wide spread
of values of Λ, our own existence would require that we occupy one of the tiny fraction
in which the value is sufficiently small. This is an example of an ‘anthropic’ fine-tuning 
argument — indeed, in terms of the precision required, it may be the most impressive tuning 
of all. 
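
The size of the mismatch between the Planck density and the observed vacuum density can be estimated directly from the constants of nature. The sketch below is illustrative only: the Hubble constant (70 km/s/Mpc) and vacuum fraction (0.7) are assumed representative values rather than figures taken from the text, and the function names are hypothetical.

```python
import math

# Physical constants in SI units.
HBAR = 1.054571817e-34   # reduced Planck constant, J s
G = 6.67430e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8         # speed of light, m s^-1

# Assumed representative cosmological parameters (not from the text).
H0_KM_S_MPC = 70.0       # Hubble constant, km s^-1 Mpc^-1
OMEGA_LAMBDA = 0.7       # vacuum fraction of the critical density
METRES_PER_MPC = 3.0857e22


def planck_density():
    """Planck density c^5 / (hbar G^2), in kg m^-3."""
    return C**5 / (HBAR * G**2)


def observed_vacuum_density():
    """Omega_Lambda times the critical density 3 H0^2 / (8 pi G), in kg m^-3."""
    h0_si = H0_KM_S_MPC * 1e3 / METRES_PER_MPC   # convert to s^-1
    return OMEGA_LAMBDA * 3 * h0_si**2 / (8 * math.pi * G)


orders_of_magnitude = math.log10(planck_density() / observed_vacuum_density())
print(f"Planck density / observed vacuum density ~ 10^{orders_of_magnitude:.0f}")
```

With these inputs the ratio comes out at a little over 120 orders of magnitude; the precise exponent depends on the assumed cosmological parameters, which is why the figure is usually quoted only as a round number.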

The oldest form of the multiverse is the ‘many worlds’ interpretation of quantum 
mechanics (Everett, 1957), in which there is no quantum collapse but the Universe branches 
every time an observation is made. Tegmark classifies this as the ‘Level III’ multiverse and 
it is the most natural framework in which to describe quantum cosmology, which applies 
when the classical spacetime description of general relativity breaks down at the Planck 
time ($10^{-43}$ s). In this approach, one has a superposition of different histories for the
Universe and uses the path integral formalism to calculate the probability of each history 
(Hartle and Hawking, 1983). In some models, this replaces the Big Bang with a bounce 
and leads to a form of the cyclic model. 
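
The Planck time mentioned above, and the Planck scale at the tail of the Uroborus, both follow from combining the constants of gravity, relativity and quantum theory. A minimal sketch using the standard definitions (the function names are hypothetical):

```python
import math

# Physical constants in SI units.
HBAR = 1.054571817e-34   # reduced Planck constant, J s
G = 6.67430e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8         # speed of light, m s^-1


def planck_time():
    """Planck time sqrt(hbar G / c^5), in seconds."""
    return math.sqrt(HBAR * G / C**5)


def planck_length_cm():
    """Planck length c * t_Planck, converted from metres to centimetres."""
    return C * planck_time() * 100.0


t_planck = planck_time()
l_planck_cm = planck_length_cm()
print(f"Planck time   ~ {t_planck:.1e} s")
print(f"Planck length ~ {l_planck_cm:.1e} cm")
```

Both values agree, to order of magnitude, with the figures usually quoted for the regime in which the classical spacetime description breaks down.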

The most speculative version of the multiverse, described by Tegmark as ‘Level IV’, 
postulates disconnected Universes governed by different laws based on different mathe- 
matical structures. It derives from an underlying philosophical stance that everything that 
can happen in physics does happen, so any mathematically possible Universe must exist 
‘somewhere’ (Tegmark, 1998). This is certainly the most philosophical of the multiverse
proposals. 

Despite the popularity of the multiverse proposal, some physicists are deeply uncom- 
fortable with it. The ideas involved are highly speculative and they are currently — and may 
always remain — untestable in the sense that astronomers may never be able to observe 
the other Universes directly. For these reasons, some physicists do not regard these ideas 
as coming under the purview of science at all. Since our confidence in them is based on 
faith rather than experimental data, they seem to have more in common with religion. On 
the other hand, evidence for other Universes might eventually be forthcoming — for exam- 
ple, from the scars of collisions with other Universes in the CMB (Garriga et al., 2007; 
McEwen et al., 2012) or from dark flows (Kashlinsky et al., 2009) or from cosmic wakes 
(Mersini-Houghton and Holman, 2009). 

Another argument might be that an idea can be regarded as scientific if it is implied by 
a theory which is testable. So if a theory predicts the multiverse, then verifying that theory 
would at least provide partial evidence. For example, this would apply if the extra dimen- 
sions of M-theory were detected with the LHC. If they are not detected, one might argue 
that extra dimensions are mathematical rather than physical (Woit, 2006; Smolin, 2007). 
But many ideas in modern physics might be regarded as mathematical and the picture of 
ultimate reality provided by theorists deviated from the common-sense reality provided by 
our physical senses long before we reached the stage of the multiverse and M-theory. 

If no direct evidence for other Universes or M-theory is forthcoming, one could still 
argue that the anthropic fine-tunings provide indirect evidence for other Universes (Carter, 
1974; Carr and Rees, 1979; Barrow and Tipler, 1986; Hogan, 2000). Although the multi- 
verse proposal was not originally motivated by an attempt to explain these tunings, it seems 
clear that the two concepts are interlinked. For if there are many Universes, the question 
arises as to why we inhabit this particular one and (at the very least) one would have to con- 
cede that our own existence is a relevant selection effect. A huge number of Universes will 
allow all possibilities and combinations to occur, so somewhere — just by chance — things 
will be right for life. Many physicists therefore regard the multiverse as providing the most 
natural explanation of the anthropic fine-tunings. If one wins the lottery, it is natural to 
infer that one is not the only person to have bought a ticket. 

In assessing the multiverse interpretation of the fine-tunings, a key issue is whether some 
of the physical constants are contingent on accidental features of symmetry-breaking and 
the initial conditions of our Universe or whether some fundamental theory will determine 
all of them uniquely (Rees, 2001). The two cases correspond to the multiverse and single 
Universe options, respectively. This relates to Einstein’s famous question: ‘Did God have
any choice when he created the Universe?’ If the answer is no, there would be no room
for the Anthropic Principle. If the answer is yes, then trying to predict the values of the 
constants would be as forlorn as Kepler’s attempts to predict the spacing of the planets 
based on the Platonic solids. 

Even if one accepts that the anthropic fine-tunings derive from a multiverse selection 
effect, what determines the selection? If it relates to the presence of observers, is some 
minimum threshold of intelligence required or does the mere existence of consciousness
suffice? Or does it just reflect some form of ‘life principle’, as advocated by Paul Davies 
(2006)? I have argued elsewhere (Carr, 2007) that the Anthropic Principle should really be 
called the Complexity Principle, in which case the connection with consciousness or life 
may be incidental. 

Another important question is whether our Universe is typical or atypical within the 
ensemble. Advocates of the Anthropic Principle usually assume that life-forms similar to 
our own will be possible in only a tiny subset of Universes. More general life-forms may be 
possible in a somewhat larger subset but life will not be possible everywhere. On the other 
hand, by invoking a Copernican perspective, Lee Smolin (1997) has argued that most of 
the Universes should have properties like our own, so that ours is typical. His own favoured 
model is a particular example of this; I will describe this in some detail because it links the 
topics of this chapter. 

Smolin argues that the physical constants have evolved to their present values through 
a process akin to mutation and natural selection. The underlying physical assumption is 
that whenever matter gets sufficiently compressed to undergo gravitational collapse into a 
black hole, it gives birth to another expanding Universe in which the fundamental constants 
are slightly mutated. Since our own Universe began in a state of great density (i.e. with a 
Big Bang), it may itself have been generated in this way (i.e. via gravitational collapse 
in some parent Universe). Cosmological models with constants permitting the formation 
of black holes will therefore produce progeny (which may in turn produce further black 
holes since the constants are nearly the same), whereas those with the wrong constants 
will be infertile. Through successive generations of Universes, the physical constants will 
then naturally evolve to have the values for which black hole (and hence baby Universe) 
production is maximised. Smolin’s proposal involves very speculative physics, since we 
have no understanding of how the baby Universes are born, but it has the virtue of being 
testable, since one can calculate how many black holes would form if the parameters were 
different. There is no need for the constants to be determined by a TOE and no need for 
the Anthropic Principle, since observers are just an incidental consequence of the Universe 
being complex enough to give rise to black holes. 


2.4 Cosmology and Metacosmology 


Although cosmology is now recognised as part of mainstream physics, it is different from 
most other branches of science: one cannot experiment with the Universe or observe other 
ones, and speculations about processes at very early and very late times depend upon the- 
ories of physics which may never be directly testable. This is why some cosmological 
speculations are dismissed as being too philosophical. I use the term ‘metacosmology’
to describe aspects of cosmology which might be regarded as bordering on philosophy 
(Carr, 2014), although we will see that one cannot delineate the cosmology/metacosmology 
border precisely. 

Figure 2.3(a) This illustrates the sequence from cosmology to metacosmology to theology. The
arrow indicates that the cosmology/metacosmology boundary evolves, so that today’s metacosmol- 
ogy becomes tomorrow’s cosmology. 



Figure 2.3(b) Three possible explanations of the anthropic fine-tunings, these becoming increasingly 
‘wild’ from a physics perspective as one moves to the right. The sequence is connected with Figure 
2.3(a) but the precise correspondence is fuzzy. 


Cosmology also relates to theology. All cultures have their creation myths (a sceptic 
might claim that the Big Bang theory is just the most recent one) and issues about the 
origin and future of the Universe also arise in religion. Of course, the remit of religion 
goes well beyond the materialistic issues which are the focus of cosmology. However, in so 
much as religious and cosmological truths overlap, they must be compatible. This has been 
stressed by George Ellis (this book), who distinguishes between ‘cosmologia’, which takes
into account ‘the magnificent gestures of humanity’, and cosmology, which just focuses on 
physical aspects of the Universe. Most scientists are even more uncomfortable straying into 
the domain of theology than philosophy. So it may be appropriate to regard cosmology, 
metacosmology and theology as forming a sequence, with metacosmology occupying a 
middle ground, as illustrated in Figure 2.3(a). 

As an example of a problem which may impinge on all three areas, consider the issue 
of the anthropic fine-tunings. One might consider three interpretations of these tunings, as 
illustrated in Figure 2.3(b). The first possibility is that there is not just one Universe but lots 
of them, all with different coupling constants, and that we necessarily reside in one of the 
small fraction which satisfies the anthropic constraints (Carr, 2007). At least some
physicists would regard this explanation as scientific and therefore put it in the cosmology box
of Figure 2.3(a). The second possibility, based on the notion that the Universe is described 
by a quantum mechanical wave function and that consciousness is required to collapse 
this, is that the Universe does not exist until consciousness has arisen. Once it has done so, 
one might think of it as reflecting back on the Big Bang, thereby forming a closed circuit 
which brings the world into existence (Wheeler, 1977). Even if consciousness really does 
collapse the wave function (which is far from certain), this explanation might be regarded 
as metaphysical and so deemed to be in the metacosmology box. The third possibility is 
that the fine-tunings reflect the existence of a ‘Creator’ or ‘God’ who tailor-made the Uni- 
verse for our benefit by placing a ‘pin’ carefully in the space of coupling constants (Leslie, 
1989; Holder, 2004). This explanation clearly belongs to the theology box. 

These possibilities represent a sliding scale of increasing unpalatability for most physi- 
cists, with the A word (Anthropic), the C word (Consciousness) and the G word (God) 
being increasingly taboo. The hard-line physicist may not be entirely comfortable invok- 
ing a multiverse but it is better than appealing to consciousness and definitely preferable 
to God. This does not mean any of the three explanations is less logical; it just reflects 
the physicist’s bias towards an explanation which is scientific. In any case, the dichotomy 
between God and multiverse is simplistic (Collins, 2007). While the fine-tunings certainly 
do not provide unequivocal evidence for God, nor would the existence of a multiverse
preclude God. On the other hand, if there is only one Universe, the argument for a fine-tuner
may become more compelling. I will not discuss such theological issues any further here.

It should be stressed that opinions differ as to the location of the cosmol- 
ogy/metacosmology boundary in Figure 2.3(a). Some physicists would regard the mul- 
tiverse as philosophy, with only a single self-created Universe of the kind envisaged by 
Stephen Hawking (2001) qualifying as physics. Davies (2006) even relegates it to theol- 
ogy, regarding the concepts of a multiverse and a Creator as equally metaphysical. The 
boundary between cosmology and metacosmology is therefore fuzzy. More importantly, I 
argue below that it is constantly evolving (hence the arrow in Figure 2.3(a)), so that today’s
metacosmology becomes tomorrow’s cosmology. The controversy over whether the multi- 
verse should be classified as cosmology or metacosmology was the focus of my dialogue 
with Ellis, in which I defended and he opposed its scientific status (Carr and Ellis, 2008). 
My present view is that the multiverse should currently be regarded as metacosmology but 
I anticipate that the cosmology/metacosmology boundary will eventually shift sufficiently 
for it to be reclassified as cosmology. So the arrow in Figure 2.3(a) represents my attempt to
compromise with Ellis. 

Similar issues arise at the microscopic frontier, which I have argued is also part of 
cosmology. Although the advent of atomic theory in the eighteenth century yielded cru- 
cial insights into thermodynamics and chemistry, many scientists were sceptical of atoms 
which could not be seen. Later it was realised that an atom is mainly empty space and that 
its constituents are not solid particles but described by a quantum wave function which is 
smeared out in space. But the long debate about the interpretation of quantum mechanics
is still dismissed as philosophy by many physicists. We have seen that proposed TOEs are
sometimes dismissed as mathematics but whether that should be regarded as physics or
philosophy is itself contentious.

Table 2.1. Lessons of history

Lesson 1  Theoretical prejudice should not blind one to the evidence
Lesson 2  New observational developments are hard to anticipate
Lesson 3  Don’t reject a theory because it has no observational support
Lesson 4  Don’t be deterred by the opposition of great scientists
Lesson 5  Be prepared to apply known physics in new domains
Lesson 6  Majority opinion and expectation are often wrong
Lesson 7  Metacosmology becomes cosmology because of new data
Lesson 8  The tide of history may be against the cosmocentric view
Lesson 9  The nature of legitimate science changes with time


2.5 Historical Perspective of Cosmology/Metacosmology


In this section I will argue that the boundary between metacosmology and cosmology 
is continuously shifting as new ideas evolve from being pre-scientific to scientific. The 
domain of legitimate science is thus always expanding. Some historical examples will be 
used to illustrate this point, each of which conveys a particular lesson. These lessons are 
discussed in more detail elsewhere (Carr, 2014) and are summarised in Table 2.1. 

To the ancient Greeks, the heavenly spheres were the unchanging domain of the divine 
and therefore outside science. It required Tycho Brahe’s observation of a supernova in 1572 
and the realisation that its apparent position did not change as the Earth moved around the 
Sun to dash that view. Because this contradicted the Aristotelian view that the heavens 
cannot change, the claim was at first received sceptically. Frustrated by those who had eyes 
but would not see, Brahe wrote: ‘O crassa ingenia. O coecos coeli spectatores’. [Oh thick
wits. Oh blind watchers of the sky.] Lesson 1: Theoretical prejudice should not blind one 
to the evidence. I would claim that the analogue of Tycho’s supernova in the multiverse 
debate is the fine-tunings, even though we are literally ‘blind’ in the sense that we cannot
see the other Universes. 

Long after Galileo had realised that the Milky Way is nothing more than an assem- 
blage of stars and Newton had shown that the laws of nature could be extended beyond the 
Solar System, there was still a prejudice that the investigation of this region was beyond 
the domain of science. In 1835 Auguste Comte commented on the study of stars: ‘We will
never be able by any means to study their chemical compositions. The field of positive 
philosophy lies entirely within the Solar System, the study of the Universe being inacces- 
sible in any possible science.’ Comte had not foreseen the advent of spectroscopy, which 
identified absorption features in stellar spectra with chemical elements. Lesson 2: New 
observational developments are hard to anticipate. Perhaps one day we will find extra
dimensions at the LHC or create baby Universes in the laboratory or visit other Universes 
through wormholes. 

Cosmology attained the status of a proper science in 1915, when the advent of gen- 
eral relativity gave it a secure mathematical basis. Nevertheless, for a further decade there 
was resistance to the idea that science could be extended beyond the Galaxy. Indeed, 
many astronomers refused to believe that there was anything beyond. Although Immanuel 
Kant had speculated, as early as 1755, that some nebulae are ‘island Universes’, simi- 
lar to the Milky Way, most astronomers continued to adopt a Galactocentric view until 
the 1920s. Indeed, the term ‘Universe’ became taboo in some quarters. Ernest Rutherford 
once remarked ‘Don’t let me hear anyone use the word “Universe” in my Department!’,
a comment some might echo today about the multiverse. The controversy came to a head
in 1920 when Heber Curtis defended the island Universe theory in a famous debate with 
Harlow Shapley. The issue was finally resolved in 1924, when Edwin Hubble measured 
the distance to M31 using Cepheid variable stars. 

A few years later Hubble — using radial velocities for several dozen nearby galaxies 
obtained by Vesto Slipher — discovered that all galaxies are moving away from us with a 
speed proportional to their distance. Alexander Friedmann had predicted this in 1922 on the 
basis of general relativity but Einstein rejected this model at the time because he believed 
the Universe (i.e. the Milky Way) was static and he invoked the cosmological constant to 
allow this possibility. In fact, Einstein continued to uphold the static model even after the 
evidence was against it and he only accepted the expanding model (admitting his ‘biggest
blunder’) several years after Hubble published his data. Lesson 3: Don’t reject a theory
because it has no observational support. Knowing how much weight to attach to theory 
and observation can be tricky. 

Georges Lemaître, who derived Friedmann’s equations independently, was the first
person to consider the implications of the Universe having started in a state of great com- 
pression and is known as the father of the Big Bang theory. This is now almost universally 
accepted but the reaction of some of his contemporaries is revealing. Einstein remarked to 
Lemaître in 1927: ‘Your maths is correct but your physics is abominable’. Arthur Eddington
was equally loath to accept the implications of the cosmic expansion: ‘Philosophically,
the notion of a beginning of the present order of nature is repugnant to me.’ In contrast to
Lemaître, he regarded the Big Bang as an unfortunate fusion of physics and theology.
Lesson 4: Don’t be deterred by the opposition of great scientists. What is regarded as 
respectable in cosmology is determined by a handful of influential people and this works 
both ways. One reason the Anthropic Principle has become more respectable is that several 
physicists of great stature have now embraced it. 

Even after Hubble’s discovery gave cosmology a firm empirical foundation, it was many 
decades before it gained full scientific recognition. When Ralph Alpher and Robert Her- 
man were working on cosmological nucleosynthesis in the 1940s, they recall: ‘Cosmology 
was then a sceptically regarded discipline, not worked in by sensible scientists’ (Alpher 
and Herman, 1988). Although they were only using known physics, people were sceptical
of applying this in unfamiliar contexts. Only with the detection of the CMB in 1964
was the hot Big Bang theory established as a branch of mainstream physics and subse- 
quent studies of this radiation have established cosmology as a precision science. Lesson 
5: Be prepared to apply known physics in new domains. Admittedly, some cosmological 
speculations depend upon unknown physics but that relates to Lesson 4. 

The last few decades have seen even more dramatic developments. First, although one 
expects the expansion of the Universe to slow down because of gravity, observations of 
distant supernovae suggest that it is accelerating (Riess et al., 1998; Perlmutter et al., 1999). 
We do not know for sure what is causing this but it must be some exotic form of ‘dark
energy’, most probably related to the cosmological constant introduced by Einstein to make
the Universe static. These discoveries have led to the concordance ‘ΛCDM’ model. For
almost 90 years (ever since the demise of Einstein’s static model) it had been assumed 
that the cosmological constant is zero, even though theorists had no clear understanding of 
why. Lesson 6: Majority opinion and expectation are often wrong. However, the nature of
the dark energy is still uncertain. We cannot be sure that it is a cosmological constant and 
some people would advocate a more radical departure from the standard model (Lahav and 
Massimi, 2014).

According to the inflationary scenario (Guth, 1981), the early Universe would also have 
undergone an accelerating phase as a result of the vacuum energy. For many years, such 
speculations were dismissed by some cosmologists as being too remote from observations. 
However, this changed when anisotropies in the CMB were discovered by COBE (Smoot 
et al., 1992) and then probed to ever greater precision with WMAP (Spergel et al., 2003) 
and the Planck Collaboration (2014). The fluctuations have exactly the form predicted by 
the inflationary scenario and this allows us to determine cosmological parameters very pre- 
cisely. This illustrates a crucial point about the evolution of the cosmology/metacosmology 
boundary. Lesson 7: Metacosmology becomes cosmology because of new data. The follow- 
ing quote from Efstathiou (2013) is pertinent here: ‘Such ideas may sound wacky now, just 
like the Big Bang did three generations ago. But then we got evidence and it changed the 
way we think about the Universe.’ However, it should be cautioned that many claimed 
effects go away (e.g. magnetic monopoles, various dark matter candidates, primordial 
gravitational waves), so one needs to wait for observational claims to be confirmed. 

We have seen that another idea which has become popular is that of the multiverse. More 
conservative cosmologists would prefer to maintain the cosmocentric view that ours is the 
only Universe but one cannot exclude the possibility that our observable Universe is just 
a minuscule part of a much larger physical reality. In some ways, this parallels the debate
about extragalactic nebulae a century ago. The evidence for other Universes can never be as 
decisive as that for extragalactic nebulae but the transformation of worldview required may 
be just as necessary. Lesson 8: The tide of history may be against the cosmocentric view. 
Helge Kragh (2013) has criticised me for using historical considerations as an argument 
for the multiverse itself but my argument is merely intended as a sociological comment 
and I would never claim that historical considerations carry the same weight as empirical
evidence. 

The final lesson of history is nicely reflected in a comment by Weinberg (2007): ‘We 
usually mark advances in the history of science by what we learn about nature, but at 
certain critical moments the most important thing is what we discover about science itself. 
These discoveries lead to changes in how we score our work, in what we consider to be 
an acceptable theory.’ Weinberg is referring specifically to M-theory and the multiverse 
but this insightful remark may be of more general application. Lesson 9: The nature of 
legitimate science changes with time. I return to this issue at the end of this chapter. 


2.6 Black Holes and the Limits of Science 


General relativity predicts that, if the matter in a region is sufficiently compressed, gravity 
becomes so strong that it forms a black hole. In the simplest case, where space has no 
hidden dimensions, the size of the black hole (the Schwarzschild radius) is proportional to 
its mass, $R_S = 2GM/c^2$. Thus the density to which matter must be squeezed scales as the
inverse square of the mass. The Sun would need to be compressed to a radius of about 3
km to become a black hole, corresponding to a density of about $10^{19}$ kg m$^{-3}$ or 100 times
nuclear density. The Sun itself is not expected to evolve to a black hole but there is a wide
range of masses above 1 $M_\odot$ in which black holes could form at the present epoch and they
might form below 1 $M_\odot$ at much earlier epochs. [See Chapter 1 of Calmet et al. (2014) for
a detailed discussion.]
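
As a rough numerical check of the figures in this paragraph, the following Python sketch evaluates the Schwarzschild radius and the corresponding mean density for one solar mass. The constants are rounded SI values and the function names are illustrative, not taken from the text. ‘Density’ here simply means the mass divided by the Euclidean volume of a sphere of radius $R_S$, which is an assumption of the estimate rather than a rigorous relativistic quantity.

```python
import math

# Rounded SI constants (assumed values for this sketch)
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m s^-1
M_SUN = 1.989e30   # solar mass, kg


def schwarzschild_radius(mass_kg):
    """Schwarzschild radius R_S = 2GM/c^2, in metres."""
    return 2.0 * G * mass_kg / C**2


def schwarzschild_density(mass_kg):
    """Mass divided by the volume of a sphere of radius R_S, in kg m^-3."""
    radius = schwarzschild_radius(mass_kg)
    return mass_kg / (4.0 / 3.0 * math.pi * radius**3)


print(f"R_S for the Sun:      {schwarzschild_radius(M_SUN) / 1e3:.2f} km")
print(f"density for the Sun:  {schwarzschild_density(M_SUN):.1e} kg m^-3")
# Inverse-square scaling: ten times the mass gives one hundredth of the density.
ratio = schwarzschild_density(10 * M_SUN) / schwarzschild_density(M_SUN)
print(f"density ratio at 10x: {ratio:.3f}")
```

With these inputs the radius comes out at roughly 3 km and the density at roughly $2 \times 10^{19}$ kg m$^{-3}$, and a tenfold increase in mass lowers the density a hundredfold.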

The most plausible mechanism for black hole formation is the collapse of stars which 
have completed their nuclear burning. However, this only happens for sufficiently massive 
stars. Those smaller than 4 $M_\odot$ evolve into white dwarfs because the collapse of their
remnants can be halted by electron degeneracy pressure. Stars larger than 4 $M_\odot$ but smaller
than about 100 $M_\odot$ burn stably until they form an iron/nickel core, at which point no
more energy can be released by nuclear reactions, so the core collapses. If the collapse 
can be halted by neutron degeneracy pressure, a neutron star will form and a reflected 
hydrodynamic shock then ejects the envelope of the star, giving rise to a type II supernova. 
If the core is too large, however, it necessarily collapses to a black hole. Above 40 $M_\odot$ the
core collapses directly but for 20–40 $M_\odot$ collapse is delayed and occurs due to fallback of
ejected material (MacFadyen et al., 2001). 

The first stars to form in the Universe may have been larger than 100 $M_\odot$. Such ‘Very
Massive Objects’ (VMOs) are radiation-dominated and unstable to nuclear-energised pul- 
sations during their hydrogen- and helium-burning phases. It used to be thought that the 
resulting mass loss would be so rapid as to preclude the existence of VMOs. However, the 
pulsations are now expected to be dissipated as a result of shock formation and this could 
quench the mass loss enough for them to survive for their main-sequence time (which is just 
a few million years). VMOs encounter an instability when they commence oxygen-core 
burning because the temperature in this phase is high enough to generate electron-positron 
pairs (Fowler and Hoyle, 1964). This has the consequence that smaller cores explode, 
while larger ones collapse to ‘Intermediate Mass Black Holes’ (IMBHs). Both numerical
(Woosley and Weaver, 1982) and analytical (Bond et al., 1984) calculations indicate
that this happens for VMOs above about 200 $M_\odot$. When IMBHs were suggested as dark
matter candidates 30 years ago (Carr et al., 1984), their existence was regarded sceptically. 
However, it is now thought that IMBHs may power ultra-luminous X-ray sources or be 
associated with Gamma-Ray Bursts. They may also exist in the nuclei of some Globular 
Clusters, formed perhaps from the coalescence of smaller mass black holes. 

Stars in the mass range above $10^5$ $M_\odot$ are unstable to general relativistic instabilities
and may collapse directly to ‘Supermassive Black Holes’ (SMBHs) without any nuclear 
burning at all (Fowler, 1966). One could plausibly envisage their formation at the centres of 
dense star clusters through dynamical relaxation: the stars would be disrupted through col- 
lisions and a single supermassive star could then form from the newly released gas. SMBHs 
might also form from the coalescence of smaller holes or from accretion onto a single hole 
of more modest mass. SMBHs are known to reside in galactic nuclei (Kormendy and Richstone,
1995). Our own galaxy harbours a $4 \times 10^6$ $M_\odot$ black hole and quasars, which
represent an earlier evolutionary phase of galaxies, are thought to be powered by $10^8$ $M_\odot$
black holes. The largest black hole to date has a mass of $2 \times 10^{10}$ $M_\odot$ (Thomas et al., 2016).
SMBH formation does not entail extreme physical conditions, since an object of $10^8$ $M_\odot$
would only have the density of water on falling inside its event horizon. Nevertheless, it 
has taken much longer for their existence to be accepted than it has for stellar black holes. 
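
The remark about the density of water can be checked with a short Python sketch. It inverts the same simple estimate used for stellar black holes, namely the mass divided by the Euclidean volume of a sphere of radius $2GM/c^2$, and solves for the mass at which that density equals 1000 kg m$^{-3}$. The constants are rounded SI values and the function name is hypothetical.

```python
import math

# Rounded SI constants (assumed values for this sketch)
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m s^-1
M_SUN = 1.989e30    # solar mass, kg
RHO_WATER = 1.0e3   # density of water, kg m^-3


def mass_for_schwarzschild_density(rho):
    """Mass (kg) whose mean density inside R_S = 2GM/c^2 equals rho.

    Uses rho = M / (4/3 pi R_S^3) = 3 c^6 / (32 pi G^3 M^2), solved for M.
    """
    return math.sqrt(3.0 * C**6 / (32.0 * math.pi * G**3 * rho))


m_water = mass_for_schwarzschild_density(RHO_WATER)
print(f"{m_water / M_SUN:.1e} solar masses")
```

Under these assumptions the crossover lies at a little over $10^8$ solar masses, so the most massive holes quoted above would form at densities below that of water.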

In the early 1970s it was realised that primordial black holes (PBHs) could have formed 
in the early Universe and be much smaller than a solar mass. This is because the cos- 
mological density was very high at early times, exceeding nuclear density within the first 
microsecond of the Big Bang and rising indefinitely at earlier times. A comparison of cosmological
density and Schwarzschild density implies that a PBH forming at time $t$ must
have a mass $c^3 t/G \sim 10^5\,(t/\mathrm{s})\,M_\odot$. This is just the mass within the cosmological horizon,
so they could span an enormous mass range: from $10^{-5}$ g for those forming at the Planck
time to 1 $M_\odot$ for those forming at the quark-hadron phase transition ($10^{-5}$ s) to $10^5$ $M_\odot$
for those forming at 1 s (Hawking, 1971). Such PBHs could have formed either from initial
inhomogeneities or spontaneously at some sort of cosmological phase transition. [See
Chapter 4 of Calmet et al. (2014) for a detailed discussion.]
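
The mass range quoted here follows directly from the relation $M \sim c^3 t/G$. A minimal Python sketch, assuming rounded SI constants and a Planck time of about $5.4 \times 10^{-44}$ s (the function and variable names are illustrative), evaluates it at the three epochs mentioned:

```python
# Rounded SI constants (assumed values for this sketch)
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8          # speed of light, m s^-1
M_SUN = 1.989e30     # solar mass, kg
T_PLANCK = 5.39e-44  # Planck time, s


def horizon_mass(t_seconds):
    """Order-of-magnitude PBH mass c^3 t / G (kg) for formation time t."""
    return C**3 * t_seconds / G


epochs = [
    ("Planck time", T_PLANCK),
    ("quark-hadron transition", 1e-5),
    ("one second", 1.0),
]
for label, t in epochs:
    mass = horizon_mass(t)
    print(f"{label:>24}: {mass * 1e3:.1e} g = {mass / M_SUN:.1e} solar masses")
```

The three results are of order $10^{-5}$ g, 1 $M_\odot$ and $10^5$ $M_\odot$ respectively, in line with the range given in the text.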

PBHs smaller than $10^{21}$ kg (about the mass of the Moon) might be regarded as ‘microscopic’,
in the sense that they are smaller than a micron, but they could still have interesting
astrophysical consequences. For example, they could collide with the Earth or have 
detectable lensing and dynamical effects or provide the dark matter (Carr et al., 2010). 
Those smaller than $10^{12}$ kg (about the mass of a mountain) would be smaller than a proton
and have dramatically different consequences. This is because in 1974 Hawking discovered 
that a black hole radiates thermally with a temperature inversely proportional to its mass 
(Hawking, 1974). For a solar-mass black hole, the temperature is around $10^{-6}$ K, which is
negligible. But for a black hole of mass $10^{12}$ kg, it is $10^{12}$ K, corresponding to the emission
of 100 MeV gamma-rays. Because the emission carries off energy, the mass of the black 
hole decreases. As it shrinks, it gets hotter, emitting increasingly energetic particles and 
shrinking ever faster. The time for a black hole to evaporate completely is proportional to
the cube of its initial mass. For a solar-mass hole, this is an unobservably long $10^{64}$ y but
it is the present age of the Universe ($10^{10}$ y) for $10^{12}$ kg. Thus PBHs with this initial mass
would be completing their evaporation today and smaller ones would have evaporated at 
an earlier cosmological epoch. Since microscopic PBHs (i.e. those smaller than $10^{21}$ kg)
are hotter than the CMB, they can also be regarded as ‘quantum’ in the sense that their 
emission is not suppressed by accretion of radiation. 
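
The two scaling laws in this paragraph (temperature inversely proportional to mass, lifetime proportional to mass cubed) can be illustrated with a small Python sketch. It does not use the full Hawking formulae. Instead it assumes, purely for illustration, the round reference point read from the text: a hole of $10^{12}$ kg with a temperature of $10^{12}$ K and a lifetime of $10^{10}$ y. All names are hypothetical and the outputs are order-of-magnitude only.

```python
SECONDS_PER_YEAR = 3.156e7

# Reference point taken from the round numbers in the text (an assumption)
M_REF_KG = 1.0e12     # reference mass, kg
T_REF_K = 1.0e12      # temperature at the reference mass, K
TAU_REF_YR = 1.0e10   # evaporation time at the reference mass, years


def temperature_k(mass_kg):
    """Temperature scaled as 1/M from the reference point."""
    return T_REF_K * (M_REF_KG / mass_kg)


def lifetime_yr(mass_kg):
    """Evaporation time scaled as M^3 from the reference point."""
    return TAU_REF_YR * (mass_kg / M_REF_KG) ** 3


def mass_with_time_left(seconds):
    """Mass (kg) at which the remaining lifetime equals the given time."""
    years = seconds / SECONDS_PER_YEAR
    return M_REF_KG * (years / TAU_REF_YR) ** (1.0 / 3.0)


for mass in (1e9, 1e12, 1e15):
    print(f"M = {mass:.0e} kg: T ~ {temperature_k(mass):.0e} K, "
          f"lifetime ~ {lifetime_yr(mass):.0e} y")
print(f"mass with one second left: {mass_with_time_left(1.0):.1e} kg")
```

On this scaling a hole has about one second left when its mass has fallen to roughly $10^6$ kg, which is the regime in which the evaporation is described as explosive.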

When the black hole mass gets down to about $10^6$ kg, the evaporation becomes explosive,
the remaining energy being released in just a second. Such explosions are unlikely
to be detectable in the simplest picture but David Cline and collaborators (e.g. Cline and 
Otwinowski, 2009) have suggested that the PBH explosions might explain some short- 
timescale gamma-ray bursts. Future observations will settle this issue but even if PBH 
explosions are not detected, or even if PBHs never formed, Hawking’s work was still a 
tremendous conceptual advance. This is because it unified three previously disparate areas 
of physics: general relativity, quantum theory and thermodynamics. So it has been useful 
to study PBHs even if they never formed! 

Hawking’s theory was only a first step towards a full quantum theory of gravity, since 
his analysis breaks down when the density reaches the Planck value of about $10^{97}$ kg
m$^{-3}$ because of quantum gravitational fluctuations in the spacetime metric. An evaporating
black hole reaches this density when it gets down to a radius of $10^{-35}$ m and a mass of $10^{-8}$
kg. Such a ‘Planckian black hole’ is much smaller in size but much bigger in mass than
an elementary particle. A theory of quantum gravity would be required to understand the 
formation and evaporation of such an object. This might even allow black holes to leave 
stable Planck-mass relics rather than evaporating completely, in which case these relics 
could be dark matter candidates (MacGibbon, 1978). 
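
The Planck values quoted in this paragraph can be recovered from $G$, $\hbar$ and $c$ alone. The Python sketch below uses rounded SI constants and takes the Planck density to be the Planck mass divided by the cube of the Planck length, which is one common convention and is assumed here rather than stated in the text.

```python
import math

# Rounded SI constants (assumed values for this sketch)
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m s^-1
HBAR = 1.055e-34   # reduced Planck constant, J s

planck_length = math.sqrt(HBAR * G / C**3)        # metres
planck_mass = math.sqrt(HBAR * C / G)             # kilograms
planck_time = planck_length / C                   # seconds
planck_density = planck_mass / planck_length**3   # kg m^-3

print(f"Planck length:  {planck_length:.1e} m")
print(f"Planck mass:    {planck_mass:.1e} kg")
print(f"Planck time:    {planck_time:.1e} s")
print(f"Planck density: {planck_density:.1e} kg m^-3")
```

The length and mass come out at about $10^{-35}$ m and $10^{-8}$ kg, and the density at several times $10^{96}$ kg m$^{-3}$, consistent with the order-of-magnitude figures above.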

Another factor may come into play as a black hole shrinks towards the Planck scale: 
the existence of extra dimensions. As indicated in Figure 2.2, the unification of the forces 
of nature may require the existence of extra ‘internal’ dimensions beyond the four dimen- 
sions of spacetime. These are usually assumed to be compactified on the Planck scale, 
in which case their effects are unimportant for black holes heavier than the Planck mass. 
However, they are much larger than the Planck length in some models and this has the 
consequence that gravity should grow much stronger at short distances than implied by 
the Newtonian inverse-square law (Arkani-Hamed et al., 2000). In other models, the extra
dimensions are ‘warped’ so that matter is confined to a 4D hypersurface, but this has the 
same gravity-magnifying effect (Randall and Sundrum, 1999). In either case, the standard 
estimate of the Planck energy (and hence the minimum mass of a black hole) could be too 
high. 

This has the important implication that black holes could be made in accelerators. For 
example, protons at the LHC reach an energy of roughly 10 TeV, which is equivalent to a 
mass of $10^{-23}$ kg. When two such particles collide, one might wonder whether they can
get close enough to form a black hole. In the standard picture, the likelihood of this is very 
small because $10^{-23}$ kg is much less than the Planck mass of $10^{-8}$ kg. However, if there
are large extra dimensions, the Planck scale is lowered and the energy required to create
black holes could lie within the LHC range. They would evaporate almost immediately,
lighting up the particle detectors like Christmas trees (Carr and Giddings, 2005). Although
there is still no evidence for this, it opens up the exciting prospect of probing black hole
evaporation, higher dimensions and quantum gravity itself.

Figure 2.4 The Cosmic Uroborus applied to black holes as a link between macro and micro physics.
QSO stands for ‘Quasi-Stellar Object’, MW for ‘Milky Way’, IMBH for ‘Intermediate Mass Black
Hole’ and LHC for ‘Large Hadron Collider’.
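
The comparison between the LHC energy and the standard Planck mass is a one-line conversion via $E = mc^2$. The following Python sketch, with rounded constants and illustrative names, converts 10 TeV to kilograms and compares it with the four-dimensional Planck mass $\sqrt{\hbar c/G}$. It says nothing about how far extra dimensions might lower the effective Planck scale, which depends on the model.

```python
import math

# Rounded SI constants (assumed values for this sketch)
EV_TO_JOULE = 1.602e-19   # joules per electronvolt
C = 2.998e8               # speed of light, m s^-1
G = 6.674e-11             # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.055e-34          # reduced Planck constant, J s


def energy_to_mass_kg(energy_ev):
    """Mass equivalent m = E / c^2 of an energy given in electronvolts."""
    return energy_ev * EV_TO_JOULE / C**2


lhc_mass = energy_to_mass_kg(10e12)       # 10 TeV, the figure used in the text
planck_mass = math.sqrt(HBAR * C / G)     # standard 4D Planck mass, kg

print(f"10 TeV is equivalent to {lhc_mass:.1e} kg")
print(f"fraction of the Planck mass: {lhc_mass / planck_mass:.1e}")
```

The mass equivalent is about $2 \times 10^{-23}$ kg, some fifteen orders of magnitude below the standard Planck mass, which is why black hole production is negligible unless the Planck scale is lowered.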

The crucial role of black holes in linking macrophysics and microphysics is summarised 
in Figure 2.4. The various types of black holes are labelled by their mass, this being propor- 
tional to their size if there are three spatial dimensions. On the right are the well-established 
astrophysical black holes. On the left — and possibly extending somewhat to the right — 
are the more speculative PBHs. If the extra dimensions at the top of the Uroborus are 
much larger than the Planck length, then we have seen that black holes might be pro- 
duced in accelerators. These are not themselves primordial but this would have important 
implications for PBH formation. 

The history of cosmology and black holes reveals interesting similarities. Both have 
involved an expansion to larger and smaller scales — often against scepticism — and the 
ideas involved may become untestable at the largest and smallest scales. At the micro end, 
we have the possibility of extra dimensions and the Black Hole Uncertainty Principle correspondence
(Carr, 2013b). At the macro end, we have the birth of the Universe through black
hole formation (Smolin, 1997) and the persistence of black holes through a cosmological
bounce (Carr and Coley, 2011). All these proposals might be regarded as ultra-speculative.
So the macro and micro frontiers correspond to physics/philosophy boundaries for black
holes as well as cosmology and similar issues arise (e.g. the detection of PBH explosions
in the black hole scenario versus bubble collisions in the multiverse scenario).

Figure 2.5 Illustrating three problems of consciousness from a relativistic perspective: (a) the passage
of time; (b) the selection of possible futures; (c) the coordination of time for spatially separated
observers.


2.7 The Flow of Time and Higher Dimensions 


A long-standing problem on the interface of physics and philosophy concerns the pas- 
sage of time. In the ‘block Universe’ of special relativity, the three-dimensional object is 
just the ‘constant-time’ cross-section of an immobile 4D world-tube and we come across 
events as our field of consciousness sweeps through the block. However, nothing within the 
spacetime picture describes this sweeping or identifies the particular moment at which we 
make our observations. Past and present and future coexist, so if one regards conscious- 
ness as crawling along the world-line of the brain, like a bead on a wire, as illustrated 
in Figure 2.5(a), that motion itself cannot be described by relativity theory. Thus there 
is a fundamental distinction between physical time (associated with special relativity and 
the outer world) and mental time (associated with the experience of ‘now’ and the inner 
world). Many people have made this point (Weyl, 1949; Brain, 1960; Davies, 1985; Lock- 
wood, 1989; Smythies, 2003). Indeed, there is a huge philosophical literature on this topic 
and an ongoing controversy between the presentists and eternalists (Price, 1996; Savitt, 
2006; Earman, 2008). 

This also relates to the problem of free will. In a mechanistic Universe, a physical object 
(such as an observer’s body) is usually assumed to have a well-defined future world- 
line. However, one intuitively imagines that at any particular experiential time there are 
a number of possible future world-lines, as illustrated in Figure 2.5(b), with the interven- 
tion of consciousness allowing the selection of one of these. Admittedly, this choice may 
be illusory but that is how it feels. The middle line in the figure shows the mechanistic
(unchanged) future, while the other lines show two alternative (changed) futures. This
implies that the past is fixed but that the future is undetermined. The failure of relativity to 
describe the process of future becoming past and different possible future world-lines may 
also relate to quantum theory. This is because the collapse of the wave function to one of 
a number of possible states entails a basic irreversibility. One way of resolving this is to 
invoke the ‘many worlds’ picture (Everett, 1957), which is reminiscent of Figure 2.5(b). 

Another relevant question is how the ‘beads’ of different observers are correlated. If two 
observers interact (i.e. if their world-lines cross), they must presumably be conscious at the 
same time (i.e. their ‘beads’ must traverse the intersection point together). However, what 
about observers whose world-lines do not intersect? Naively identifying contemporane- 
ous beads by taking a constant time slice, as illustrated by the broken line in Figure 2.5(c), 
might appear to be inconsistent with special relativity, since this rejects the notion of simul- 
taneity at different points in space. However, the notion of a preferred time is restored in 
general relativity because the large-scale isotropy and homogeneity of the Universe single
out a special ‘cosmic time’, measured by clocks comoving with the expanding cosmic
background. Even for an inhomogeneous cosmological model, preferred spatial hypersur- 
faces can be specified as having constant proper time since the Big Bang (Ellis, 2014). One 
also needs some concept of simultaneity at different points in space in quantum mechanics 
in order to describe the Einstein–Podolsky–Rosen (1935) paradox. The problem of reconciling
relativity theory and quantum mechanics may therefore connect to the problem of
understanding consciousness. 

One model for describing the flow of time and collapse of the wave function invokes 
a ‘growing block universe’. This is illustrated in Figure 2.6, which represents potential 
futures by dotted lines, some of which solidify as decisions or observations are made (cf. 
Ellis, 2014). Note that it is the transition between the four diagrams which corresponds 
to the flow of time. However, even this does not describe the flow of time because all the 
information is displayed in the last picture. It is just a block Universe with dotted lines, so 
we need some extra ingredient. 

One possible extra ingredient, proposed by the philosopher C. D. Broad (1953), is a 
second type of time ($t_2$), or at least a higher dimension, with respect to which our motion
through physical time ($t_1$) is measured. This is illustrated in Figure 2.7(a), which represents
the progression of consciousness in a 4D space as a path in a 5D space. At any moment in
$t_2$ a physical object will have either a unique future world-line (in a mechanistic model) or a
number of possible world-lines (in a quantum model). The intervention of consciousness or 
quantum collapse allows the future world-line to change or be selected from, respectively, 
as indicated by the solid lines in Figure 2.6. Since the future is not absolutely predetermined 
in this model, there is still a difference between the past and the future. As illustrated in 
Figure 2.7(b), at any point in $t_2$ the past in $t_1$ is uniquely prescribed but the future is fuzzy.

This interpretation of the flow of time is also suggested by the Randall–Sundrum proposal,
illustrated in Figure 2.2, in which spacetime is regarded as a 4D brane embedded in a
5D bulk. In the simplest case, the brane corresponds to the flat spacetime of special relativity.
However, there is a cosmological version of this picture, called ‘brane cosmology’,
in which the brane is curved and space is expanding (Maartens, 2004). The cosmic expansion
can then be interpreted as being generated by the brane’s motion through the fifth
dimension. My proposal identifies this fifth dimension with the extra dimension invoked
to explain the flow of time. Although models with multiple times have also been proposed
by physicists (Weinstein, 2008), it is not generally appreciated that the Randall–Sundrum
picture may resolve a long-standing philosophical problem.

Figure 2.6 Sequence of decisions in growing block universe, with past shown solid and possible
futures shown dotted.

Figure 2.7 Describing flow of conscious time in a 4D structure with second time (a) and
5D structure (b).

The relativistic solution which describes the motion of the brane through the bulk is 
the 5D Schwarzschild–Anti de Sitter model (Bowcock et al., 2000; Mukohyama et al.,
2000). The coordinates within the brane are (r, θ, φ, T) with T being cosmic time. The
fifth coordinate R corresponds to the embedding dimension, with the brane being located 
at R = a(T) where a(T) is the cosmic scale factor. So the extra dimension is spacelike 
but implicitly related to T. This is why the brane can be regarded as moving through the 
bulk and why the extra dimension is naturally identified with hypersurfaces of homogeneity
(T = const.). The model has two parameters: the mass of the 5D black hole
(m) and the 5D cosmological constant (Λ). The Randall–Sundrum brane has m = 0 and
fixed a, so it is static and the background is anti-de Sitter. If m ≠ 0, the metric takes the 5D
Schwarzschild–Anti de Sitter form, with the fifth coordinate playing a role analogous to
the ‘radial’ distance in the 4D Schwarzschild solution. The solution contains a black hole 
event horizon. This means that the Universe effectively emerges out of a 5D black hole 
and that the fifth coordinate becomes timelike at sufficiently early times. This might be 
contrasted with the proposal of Smolin (1997), in which the Universe emerges from a 4D 
black hole. 

Clearly, invoking a second time dimension only generates a global flow of time. It does 
not explain the sense of individual identity (i.e. the first person perspective). As discussed 
elsewhere, for this one needs to introduce the notion of a ‘specious present’ and I relate
this to the presence of other compactified dimensions (Carr, 2015). However, this proposal 
obviously does not resolve all the philosophical problems associated with the passage of 
time (Price, 1996). 


2.8 Concluding Remarks on the Limits of Science 


In this chapter I have argued that the domain of legitimate science grows with time, partly 
because new data become accessible and partly because the nature of science changes. In 
concluding, I will focus on the latter possibility and try to identify the crucial features of 
science with particular reference to astronomy. I will end with some remarks about the 
relevance of mind to physics. 

It used to be assumed that science depends on experiments but astronomers cannot exper- 
iment with stars and galaxies. They can only let nature do this for them by observing 
billions of them in different states of evolution. Cosmology is in a worse state because we 
can only observe our own Universe and it is usually assumed that observability is essential 
to science. However, there are many entities posited by physics which cannot be seen (e.g. 
quarks and the interior of a black hole); one just needs some aspects of a theory to be
observable for it to be regarded as scientific. Others have emphasised the importance of
testability or falsifiability in science. But on what timescale should one demand this? Is it 
reasonable to deny that M-theory qualifies as science because it has not been vindicated 
experimentally after 30 years? It might take a hundred years but the definition of science 
should not depend upon how long a problem takes to solve. 

The question of whether M-theory and the multiverse are part of legitimate science is 
clearly unresolved at present. This is why I relegate the multiverse to the domain of meta- 
cosmology, leaving open the possibility that it will eventually be promoted to cosmology. 
Even if they are, there is one sense in which the current situation is very special. This is 
because for the first time the macro and micro physics/philosophy boundaries in Figure 2.1 
have merged, the very large and very small being unified through quantum gravity. The 
question then becomes: does this merging represent the completion of science or merely 
a transformation in its nature — of the kind to which Weinberg refers and which tends to 
happen with every paradigm shift (Kuhn, 1970)? 

Personally I favour the latter view and I will end by suggesting some possible fea- 
tures of the new type of science. Firstly, since we are in the domain of quantum gravity, 
one feature is likely to be a transcendence of the usual spacetime description. In par- 
ticular, the new paradigm may involve extra dimensions, this being a feature of both 
M-theory and some multiverse proposals. Another feature of the new science may be a 
more explicit reference to consciousness. The mainstream view is that consciousness has 
a purely passive role in the Universe. In fact, most physicists assume that it is beyond 
their remit altogether because physics is concerned with a ‘third person’ account of the 
world (experiment) rather than a ‘first person’ account (experience). They infer that their 
focus should be the objective world, with the subjective element being banished as much as 
possible. 

On the other hand, other physicists are sceptical of claims to be close to a TOE, when 
such a conspicuous aspect of the world is neglected. Thus Noam Chomsky (1975) declares 
‘Physics must expand to explain mental experiences’. Roger Penrose (1994) predicts ‘We
need a revolution in physics on the scale of quantum theory and relativity before we can 
understand mind’. Andrei Linde (2004) suggests that ‘consciousness is as fundamental 
to the cosmos as space-time and mass-energy’. Many physicists disagree with this and 
regard the C word as just as taboo as the A word. However, in the last few decades there 
have been several hints from physics itself (e.g. the Anthropic Principle) that mind may be
a fundamental rather than incidental feature of the Universe.

I will end on a provocative note by returning to the image of the Cosmic Uroborus 
in Figure 2.1. Although we have known since Copernicus that we are not at the centre 
of the Universe geographically, Figure 2.1 suggests we are at the centre of the scales of 
structure. This is because simple physics shows that the size of a human — or any living 
being — is roughly the geometric mean of the Planck length and the size of the observable 
Universe. Figure 2.1 also encapsulates the triumph of physics in explaining the dazzling 
array of increasingly complex structures which have evolved in the 14 billion years since 
the Big Bang. Since the culmination of this complexity — at least on Earth — is the human 
brain, whose remarkable attributes include consciousness, it is curious that this attribute is 
almost completely neglected by physics. The Uroborus also represents the way in which 
we have systematically expanded our outermost and innermost limits of awareness through 
scientific progress (i.e. it represents a blossoming of consciousness). Thus the physical 
evolution of the universe from the Big Bang (at the top of the Uroborus) through various 
stages of complexity to humans (at the bottom) is just the start of a phase of intellectual 
evolution, in which mind works its way up both sides to the top again. Perhaps it is not 
inconceivable that the marriage of relativity theory and quantum theory at the top of the 
Uroborus will involve mind in some fundamental way. 
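
The geometric-mean claim can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the numerical values for the Planck length, the radius of the observable Universe and the human height are assumed reference figures that do not appear in the text, and the function names are purely illustrative.

```python
import math

# Assumed reference values (not given in the text):
PLANCK_LENGTH_M = 1.616e-35   # Planck length in metres
UNIVERSE_RADIUS_M = 4.4e26    # approximate radius of the observable Universe in metres
HUMAN_SIZE_M = 1.7            # typical human height in metres


def geometric_mean(a: float, b: float) -> float:
    """Return the geometric mean of two positive lengths."""
    return math.sqrt(a * b)


def log_position(x: float, lo: float, hi: float) -> float:
    """Return how far x lies between lo and hi on a logarithmic scale (0 = lo, 1 = hi)."""
    return (math.log10(x) - math.log10(lo)) / (math.log10(hi) - math.log10(lo))


# Exact logarithmic midpoint between the two extreme scales.
midpoint_m = geometric_mean(PLANCK_LENGTH_M, UNIVERSE_RADIUS_M)

# Where a human sits between the two extremes, as a fraction of the log scale.
human_fraction = log_position(HUMAN_SIZE_M, PLANCK_LENGTH_M, UNIVERSE_RADIUS_M)

# How many powers of ten separate a human from the exact midpoint.
decades_from_midpoint = math.log10(HUMAN_SIZE_M / midpoint_m)

if __name__ == "__main__":
    print(f"geometric mean of the two scales: {midpoint_m:.2e} m")
    print(f"human position on the log scale:  {human_fraction:.2f}")
    print(f"decades between human and mean:   {decades_from_midpoint:.1f}")
```

With these assumed values the geometric mean comes out near a tenth of a millimetre, so a human lies roughly four orders of magnitude above the exact logarithmic midpoint. On a scale spanning about sixty orders of magnitude, that is 'roughly' central in the sense the text intends.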


Moving Boundaries? — Comments on the Relationship 
Between Philosophy and Cosmology 


CLAUS BEISBART 


3.1 Introduction 


A popular account of the development of science, in particular of cosmology, and of the 
relation between science and philosophy goes like this: Science is inherently progressive 
and ever extends our knowledge. Step by step, phenomena at smaller and larger scales 
become known to us. Philosophy, by contrast, is concerned with those questions that we 
cannot (yet) answer in a scientific way due to a lack of empirical evidence. Accordingly, as 
science progresses, the boundary between the sciences and philosophy shifts. As a result, 
philosophy continuously loses ground, thus becoming more and more marginal — or so the 
view is. The aim of this comment is to discuss this account of science and philosophy with 
a special emphasis on cosmology. 


3.2 A Popular Story About the Development of Science and Philosophy 


Let us first unfold the popular story in more detail. There is quite a lot to support it. Early 
attempts to account for the natural world and to explain the observed phenomena, e.g. 
in ancient Greece, were fraught with speculation. There was simply no alternative, since, 
in modern terms, background knowledge was small and data sparse. Accordingly, to the 
extent that there was natural science, it was speculative and in this sense philosophical. 
Philosophy and the natural sciences were not, and could not be, properly distinguished, 
as is evident e.g. from the fact that a renowned philosopher, viz. Plato, wrote a dialogue 
concerned with what we now call cosmology. 

Given the lack of knowledge and data, it is no surprise that there were rival views on 
how the world is composed and how far it extends. As to the small scales, there was contro- 
versy as to whether the natural world is built out of tiny, indivisible bodies, as the atomists 
claimed. There were competing views about the largest scales too: Whereas the atom- 
ists favoured an infinite world, the school that gained predominance for quite some time 
claimed that the Universe was finite and consisted largely of what is now considered to be 
the solar system (consult Kragh (2007) for a history of cosmology). 

After the so-called Scientific Revolution, and particularly during the twentieth century, 
our knowledge of the natural world has rapidly increased. The hypothesis that bodies are 
made up of a fairly small number of elements gained momentum after the work of John 
Dalton. Around 1900, the hypothesis that matter is made of atoms that were invisible at 
the time gained support because it could provide detailed accounts of various experiments. 
In 1911, Ernest Rutherford proposed a model according to which atoms in turn consist of 
nuclei and electrons. Soon after, it was found that the nucleus is composed of protons and 
neutrons. According to the standard model, which is the basis of contemporary research in 
elementary particle physics, protons and neutrons consist of quarks. We are now talking 
about scales well below 10⁻¹⁵ m. 

Cosmology is intimately related to progress in the opposite direction. To recap the his- 
tory briefly: During the Scientific Revolution, geocentrism was sacrificed in favour of 
heliocentrism. Galileo Galilei discovered that the Milky Way is composed of stars too, 
giving rise to the idea that we live in a huge, but flattened system of stars. Immanuel Kant 
and others suspected that there are other galaxies of this type, a hypothesis that was 
confirmed when the Great Debate about nebulae was decided. Today, large-scale structure 
surveys such as the Two-degree Field project and the Sloan Digital Sky Survey (SDSS) 
redshift surveys map millions of galaxies. Objects have been observed that are of the order of 
10²⁵ m away from us. A hypothesis that has recently become very fashionable concerns scales 
even larger than the observable Universe; it has it that the Universe is only part of a huge 
Multiverse of Universes that differ in their values of important parameters.¹ 

No doubt then that the Universe of scientific knowledge has been expanding in both 
directions and that we can now zoom in on the microcosm and the macrocosm in some 
detail. And it is not just the case that our knowledge extends to spatial scales that are ever 
more remote from our own; rather, our knowledge regarding temporal scales has also been 
expanding; and we can now tell a detailed story about how the observable Universe has evolved 
over something like 13.8 billion years. 

Philosophy, by contrast, has not been progressive in this way. There is no realm of things 
with respect to which philosophy has attained more and more knowledge in the way physics 
did. Moreover, scientific findings have step by step cleared up what seemed to be philo- 
sophical grounds; a lot of claims that philosophers have issued about the structure of the 
material world, e.g. about the orbits of planets, have been proved false. Other speculations 
such as the views of the atomists may seem visionary and pioneering in the light of modern 
findings. Nevertheless, the extent to which previous philosophical views about the struc- 
ture of the material world hold true does not turn on the philosophical merits the views 
were supposed to have; it seems rather a lucky coincidence if philosophical views accord 
with the findings of modern science. And today, no philosopher would dare to develop 
views about the macro or the micro-world without heavily leaning on science; every other 
approach would seem crazy. 

So much for the popular story. A variety of it is given in this book by Bernard Carr (see 
Chapter 2). Other physicists subscribe to something like this too. For instance, Hawking 
and Mlodinow (2010, p. 13) admit that questions about the Universe were once discussed 
by philosophers. As to our own time, their diagnosis is: ‘Philosophy has not kept up with 
the modern developments of science, particularly physics. Science has become the bearer 
of the torch of discovery in our quest of knowledge’.² 


1 See e.g. North (1965), Kanitscheider (1991) and Kragh (2007) for the history of cosmology; see Mason (1962) 
for a general history of the sciences. 


3.3 How May the Story Continue? 


If the development of physics is correctly pictured in this account, then a natural question 
to ask is: Will the story go on in the way it has been evolving thus far? Will physicists 
continue to extend our knowledge to ever smaller and larger scales? Will physicists find 
out that the particles that they take now to be elementary are composed of even smaller 
particles? Likewise, concerning large scales, will the Multiverse hypothesis stabilise into 
scientific knowledge, and, if so, will the Multiverse later be found to be part of an even 
larger I-don’t-know-what-verse? 

There are two ways in which the story may come to an end, concerning small or large 
scales, respectively. The first alternative is that physicists hit the limits of being. Maybe, 
certain particles or entities of somehow different types are simply elementary and are not 
composed of anything else, and the laws that hold about these entities are the foundations 
of the whole of physics. Likewise, concerning large scales, maybe everything that exists is 
indeed confined within certain scales. If the respective scales have become known to physicists, 
there is no further progress to be made. The only question then is whether we can ever know 
that we have reached the limits of being. Can we know that there is nothing above or below 
the scales that we have learned about up to this point? If the answer is yes concerning both 
scales, the exploration of ever smaller and larger scales will have come to its definite end. 
This will not be the end of physics though, because a huge part of physics has always been 
concerned with discovering, describing and explaining yet unknown phenomena at scales 
between the smallest and largest known scales. There is no reason to expect that a theory 
of everything would relieve physicists of this task. 

The other alternative is that physicists reach the limits of what humans can know before 
they reach the limits of being. Maybe, certain scales are simply beyond what we can ever 
grasp, describe and explain. Under this alternative, the question is whether we can ever 
know that we have reached the limits of what we can know. 

The question of whether the scientific endeavour to study ever smaller and larger scales 
may come to a halt is at the centre of an important part of the ‘Critique of Pure Reason’ by 
Immanuel Kant (see particularly A405–A591, B432–B619). For Kant, progress in all sorts 
of directions is demanded by reason: It is the task of reason to find the conditions of those 
things already established (A508 f./B536 f.). The problem though is, according to Kant, 
that reason does not content itself with stepwise progress; rather it develops views and 
arguments that try to anticipate whether or not progress will go on for ever and whether there 
are limits of being (e.g. A VII f. and A408–420/B435–448). As a result, reason is caught 
in a contradiction because there seem to be decisive arguments both in favour of, and against, 
the view that there is an infinite chain of conditions. The deeper reason for the problems 
is that reason leaves the limits of any possible experience if it tries to infer whether the 
chain of conditions is infinite or not (Kant, 1781). The search for ever larger scales is 
covered by what Kant calls the first antinomy (A426–433/B454–461). In Kant’s words 
the question is whether the world has an infinite spatial extension. The search for ever 
smaller particles is discussed in what Kant calls the second antinomy (A434–437/B462–B465). 
In Kant’s terms the crucial question is whether substances consist of simple 
parts. 


2 See below for a related quotation by Hawking. 

Kant’s main target is a metaphysical cosmology, i.e. an attempt to answer questions 
about the basic structure of reality in terms of a priori reasoning. But his conclusions 
extend to the empirical sciences because he thinks that the questions cannot be answered 
by the sciences, either. He argues that, as far as the world of experiences goes, there are 
no definite answers as to whether substances are composed of simple parts and whether 
the world is infinite (A502–507/B530–535). This implies that physics will never come to a 
point that is known to be the end concerning either large or small scales. Kant’s discussion 
in the ‘Critique of Pure Reason’ is thus challenging in view of the natural sciences too. 

Kant’s terminology may strike us as old-fashioned metaphysical parlance, and, even 
worse, the way he has formulated some of his questions presupposes a physics that is 
now outdated. For instance, Kant assumes that space is infinite, which need not be the 
case according to modern physics. Kant’s discussion thus needs a thorough analysis. It 
may nevertheless turn out that he has a point. In my view, he ultimately poses important 
questions: What would it mean to say that we arrive at a point at which our knowledge 
covers every scale? And how can there ever be scientific evidence that there is no more to 
come? 


3.4 Have We Already Reached the Limits of What We Can Know? 


However we may think of these questions and of the limits of being and of possible knowledge, 
the exploration of smaller and larger scales has already become increasingly 
difficult. The reason is of course that we move away from those scales that we can directly 
observe. A possible worry thus is that the story we are discussing is not exactly right. 
Maybe, the alleged findings about very small and large scales do not really qualify as 
knowledge. This certainly seems to be the case concerning the Multiverse at the largest or 
concerning String Theory at the smallest scales; we are here talking about no more than 
hypotheses, which may be attractive on scientific grounds but which do not yet qualify as 
knowledge. But the point may even extend to scales that are closer to us. Maybe, we do not 
really know that there are quarks or electrons or that there are billions of other galaxies. 
Protons and quarks are only detected using instruments the operation of which is built upon 
a great number of assumptions. The identification and description of very distant objects 
such as high-redshift galaxies and quasars draws on theories and assumptions too. The 
Multiverse hypothesis even goes beyond what we may observe according to the General 
Theory of Relativity because it postulates a plurality of Universes from which no signal can 
yet have reached us. If the hypothesis is accepted, then that is only because it is assumed 
to explain some features of our observable Universe. 

Philosophy of science features an interesting debate as to how seriously we should take 
the findings that are now accepted by scientists. A particular focus is on the question to 
what extent science has brought us knowledge about unobservables. Whereas realists take 
it that physics has given us knowledge about scales much beyond what we may observe, 
some non-realist positions such as Bas van Fraassen’s constructive empiricism deny this 
(van Fraassen, 1980). Interestingly, the debate is almost exclusively focused on small 
scales, thus neglecting the realm of cosmology. One reason is that an influential notion of 
observability from the philosophical literature (van Fraassen, 1980, p. 16) abstracts from 
our spatio-temporal position and thus does not take into account limitations that derive 
from so-called horizons (see below). But some of the worries of non-realists refer to large 
scales too. One worry is particularly important: The available data or even the data of which 
we may avail ourselves may underdetermine the choice between theories and hypotheses, 
simply because several hypotheses are compatible with the data. If this is so, it seems that 
we cannot know which of the hypotheses is true. 

Whether this type of argument is sound depends on two questions: First, are there alter- 
natives to what we believe about small or large scales, or do we at least have reasons to 
assume that there are such alternatives that comply with all data? Second, if there are sev- 
eral hypotheses that are compatible with the same available data, are there nevertheless 
good reasons to embrace one hypothesis as the true one, e.g. because it has much more 
explanatory power than its rivals? 

Let us briefly discuss the questions for both small and large scales. At small scales, a lot 
of findings are uncontested, e.g. that there are six types of quark, as the Standard Model 
of particle physics has it. There is of course the question of whether quarks are composed 
of other things or whether they are certain strings, but we may say that these are simply 
questions about the composition and properties of quarks, and that the openness of these 
questions does not put into doubt what we do already know about quarks. But there is in 
fact a problem here because the particles from the standard model are described in terms of 
quantum mechanics. This theory has puzzled philosophers and physicists for quite a while, 
a main question being what the theory really tells us about the world. There are various 
so-called interpretations of quantum mechanics. According to Bohmian mechanics, for 
instance, the ultimate quantum objects are particles that are not so different from classical 
particles, except that they follow a dynamics that is different from classical mechanics (see 
Goldstein (2013) for an introduction). Other interpretations have it that there is simply the 
wave function in configuration space (Ney and Albert, 2013) or that quantum mechanics 
describes a multitude of classical branches, as the so-called many worlds interpretation has 
it (for an introduction see Vaidman (2015)). Since quantum mechanics provides the conceptual 
framework of the standard model, the problem is that we do not really know what 
any quantum-mechanical particle, be it a quark or an electron, is: Are we talking about 
classical particles, are the so-called particles really just an aspect of the wave function, 
etc.? Note that we are here not just concerned with some features of electrons, quarks, 
etc., but rather with what they really are. For the most part, the different interpretations of 
quantum mechanics cannot be distinguished in terms of possible observations. This does 
not mean though that there are equal grounds to accept each of the hypotheses. There are 
now intensive debates as to which interpretation is to be preferred; the debates are much, 
but not exclusively, advanced by philosophers. That philosophers engage in these debates, 
which cannot be decided in terms of observations, is of course no coincidence; it accords 
well with the account of scientific progress that we are discussing. And it need not be the 
case that the debates can never be decided in a reasonable way; it may turn out that most 
reasons speak in favour of one interpretation. Note in particular that quantum mechanics 
itself may be further developed by physicists, thus facilitating the interpretation. Nevertheless, 
right now, the choice between the different interpretations of quantum mechanics is 
underdetermined by the data, and very likely also by other reasons for choosing a particular 
interpretation. If this situation continues, then the picture we are discussing is over-optimistic 
because we have hit the limits of what we can know. 

For large scales, there is underdetermination too. General results show that our obser- 
vations are compatible with various markedly distinct space-time structures (Manchak, 
2009). Very roughly, the problem is that, according to the theories of relativity, the transmission 
of physical information is limited by the speed of light. This means that every observer 
in a space-time has a horizon and can only have obtained signals from some part of the 
space-time. Since the observable realm fits many possible space-times, various space-time 
models are compatible with the observations that an observer could have made. 
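
The structure of this underdetermination argument can be illustrated by a toy example. The sketch below is written in Python and is not Manchak's construction; the two 'models', the horizon value and all names are hypothetical, chosen only to show how two descriptions can agree on everything inside a horizon and still differ beyond it.

```python
# Hypothetical horizon distance in arbitrary units (an assumption for illustration).
HORIZON = 1.0


def model_a(r: float) -> float:
    """Hypothetical model A: the same value of some quantity at every distance r."""
    return 1.0


def model_b(r: float) -> float:
    """Hypothetical model B: identical to model A inside the horizon, different beyond it."""
    if r <= HORIZON:
        return 1.0
    return 1.0 + (r - HORIZON) ** 2


def observable(points: list[float]) -> list[float]:
    """Keep only the sample points from which a signal could have reached the observer."""
    return [r for r in points if r <= HORIZON]


sample_points = [0.0, 0.25, 0.5, 0.75, 1.0, 1.5, 2.0]
seen = observable(sample_points)

# Both models predict exactly the same values at every observable point ...
agree_on_data = all(model_a(r) == model_b(r) for r in seen)

# ... although they disagree about regions beyond the horizon.
differ_somewhere = any(model_a(r) != model_b(r) for r in sample_points)

if __name__ == "__main__":
    print("models agree on all observable points:", agree_on_data)
    print("models differ somewhere beyond the horizon:", differ_somewhere)
```

In this toy setting no observation made inside the horizon can favour one model over the other, so any preference between them has to rest on non-observational grounds such as explanatory power.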

The crucial question then is whether there are other reasons to argue in favour of one 
or the other space-time. No physicist would dare to commit herself to a particular space- 
time model, but e.g. Multiverse theorists make at least some claims that stretch beyond the 
scales we can observe. One idea might be that explanatory concerns provide justification 
for doing so: A particular way of explaining what we observe may be so 
convincing that we take it to represent the Universe even at larger scales. There are problems 
though with this strategy, e.g. because at least some such explanations are very expensive in 
terms of ontology (see Beisbart (2009) and Ellis (2014)). If the problems cannot be solved, 
then a permanent underdetermination prevents us from knowing scales well beyond our 
horizon. 


3.5 What Significance Do Shifting Standards of Science Have for the Story? 


One possible way to get around this problem seems to be to say that science has been evolv- 
ing and continues to evolve and that the standards of science change during this evolution 
too. One then can admit that e.g. the Multiverse hypothesis may not qualify as scientifi- 
cally established by standards that were once held to be scientific, but this would not matter 
too much if the Multiverse hypothesis complies with those standards that are postulated at 
some point in the future. 

This move is based upon correct observations about the history of science. To begin 
with the obvious, science has certainly changed in that the content of what was counted 
as scientific knowledge has altered. The point applies to cosmology in particular, because 
our conception of the Universe has changed a lot. Of course, in a purely formal sense, the 
Universe has always counted as everything there is in terms of matter. But the views about 
what this Universe is have changed tremendously over the centuries. 

Moreover, as e.g. historian and philosopher of science Thomas S. Kuhn has stressed 
(Kuhn, 1962), along with the scientific findings, the standards with which the sciences are 
required to comply have changed. The idea is that basic hypotheses about a field and the 
standards come in a package. For instance, if you think that all material bodies are nothing 
but atoms, you will particularly value a special type of explanation, viz. micro-explanation, 
which accounts for some phenomena by drawing on the underlying atoms, their properties 
and mutual forces. If you disagree with the atomistic picture, your standards for good 
explanations are likely to differ. 

But the assumption that the standards of scientific work can change creates its own 
problems. Suppose that we have to choose between two rival theories, each of which suggests 
its own standards. By what standards can we decide which theory to choose? If there are 
no standards that are not theory-relative, then each theory may seem valuable by its own 
standards, without there being any rational way to choose among the theories. Kuhn is 
famous for drawing such a conclusion, but this conclusion was to a large extent received with 
hostility. Resistance against Kuhn’s conclusion was not just based upon wishful thinking, 
but also on arguments that attempted to show that certain theory changes are well supported 
by quite uncontroversial reasons. 

The lessons for our purposes are clear enough: If some people doubt that scientific 
progress has indeed yielded knowledge about one or the other part of the Universe, it 
will not do to insist that the standards of scientific research are subject to change and that 
the findings that are doubted by some qualify as scientific knowledge according to current 
or, maybe, future standards. Rather, a case should be made that the standards 
appealed to are superior to their rivals, e.g. because they reflect a better 
understanding of some parts of the world or because they have proved fruitful. It is doubt- 
ful whether such a case has yet been made concerning String Theory or the Multiverse 
hypothesis (but see Dawid (2013) for an interesting discussion about the epistemological 
underpinnings of String Theory). 


3.6 Is No Task Left for Philosophy? 


Things are different if we turn to results that have withstood critical scrutiny for quite some 
while. It is plausible to say that results about quarks and electrons or about the evolution 
of the observable Universe have much improved our knowledge about the micro and the 
macro-cosmos, respectively (this point holds true even if we do not know what quantum 
particles really are). Let us thus assume that, to a significant extent, the story under investi- 
gation has it right for the evolution of physics. What are the consequences for philosophy? 
Is the implication that it has lost significant ground and that nothing is left over for it? 
Hawking (1992, pp. 174-5) writes: 

In the eighteenth century, philosophers considered the whole of human knowledge, including 
science, to be their field and discussed questions such as: Did the Universe have a 
beginning? However, in the nineteenth and twentieth centuries, science became too technical 
and mathematical for the philosophers, or anyone else except a few specialists. 
Philosophers reduced the scope of their inquiries so much that Wittgenstein, the most 
famous philosopher of this century, said, ‘The sole remaining task for philosophy is the 
analysis of language.’ What a comedown from the great tradition of philosophy from 
Aristotle to Kant! 


Is Hawking right? The answer is clearly No. There is clear historical and systematic 
evidence for this answer. 

Turn first to the historical evidence. Hawking is of course right that the origin of the 
Universe was an important theme for Kant, but is not any more for most twentieth-century 
philosophers. But to use Kant as an example, he was thinking about a lot of other questions 
too. In his ‘Critique of Pure Reason’ he assumes that metaphysics is concerned with other 
questions such as: Is the human soul a simple substance? Are we free to choose? Does 
God exist (e.g. B7)? But in the Critique, his project was not to answer these questions. 
Rather, Kant urged a critical philosophy that first examines to what extent human beings 
can answer the questions of metaphysics (B22 f.). And, as a matter of fact, he argued that 
human beings would never be able to answer the questions because they transcend any 
possible experience. His own philosophy in the Critique thus is a thorough investigation 
of human reason and human knowledge; the same is true for many other classic works in 
philosophy, e.g. the writings of John Locke and David Hume. This type of work is not in 
any way affected by the progress of physics (even though some of this work is now related 
to research carried out in cognitive science). More generally, questions about the micro 
and the macrocosm or about the origin of the world define one, but not the exclusive or 
predominant, occupation of philosophers during the history of philosophy. 

Turn now to systematic reasons why there is still plenty of ground left for philosophy. It 
is sometimes said that cosmology is about everything there is (e.g. Peacock, 1999, p. xi), 
but this statement is at best premature, if not misleading or false. Cosmology as a physical 
science is concerned with the spatial and temporal distribution of material stuff. As such, 
it does not deal with e.g. normative issues, with mental phenomena, e.g. consciousness, 
or with the existence of universals. Cosmology and physics, as they are currently done, 
do not answer the question of what we should do and what are good reasons to believe 
something. They are not concerned with the understanding of consciousness. And the ques- 
tion of whether there are mind-independent universals, e.g. natural kinds or properties, is 
orthogonal to cosmology too. 

It may eventually turn out that everything that we might want to say about normative 
questions and about consciousness can ultimately be reduced to physical theories. It may 
further turn out that there are simply no universals or that they are part of physics. If this is 
so and if the other things we talk about either do not exist or can be 
fully described in terms of physics, then cosmology is about everything. But whether this 
condition is fulfilled is up for philosophical debate. 

Questions about consciousness or the existence of universals may seem a bit far-fetched; 
but note that there are more mundane questions that are currently discussed in philosophy. 
These questions are actually posed by physics. One has already been mentioned above, viz. 
the question of what quantum mechanics implies about the basic structure of the material 
world. Another question is how we should think of space and time. Time in particular is 
hard to understand: It is a basic experience that time passes, but it is notoriously difficult to 
make sense of this in terms of physical theories. Physics does not fully deal with questions 
like these; it is rather philosophers of physics who discuss them. 


3.7 What Sort of Knowledge Do Philosophers Gain? 


It is a fact that there is now a division of labour between physicists and philosophers of 
physics. Philosophers do not investigate the redshifts of galaxies, and at least most physi- 
cists do not address the question of how time passes. But this division of labour raises 
a question: If philosophers think about, e.g., space and time, how are they entitled to their 
claims? What are the sources of knowledge upon which they rely? How can they ever argue 
for their conclusions without drawing on experience? And if they only draw on experience, 
what is the point of claiming that there is a legitimate role for a philosophy of physics quite 
apart from physics itself? 

This question raises deeper issues, e.g. whether there are brands of knowledge that are 
not empirical. There are a number of respectable suggestions for such knowledge, e.g. the 
Kantian proposal that some knowledge is not empirical because it names the necessary 
conditions of experience. But this is not the place to discuss this and other proposals. I will 
only raise a weaker point: It is not so clear how far our experience extends. Can we still say 
that we know by experience that there are atoms? And are there empirical reasons to think 
that the Universe is only part of a larger Multiverse? If it is not clear how far our experience 
goes, then why require that philosophy be based upon something other than experience? 

In the passage quoted above, Hawking complains that some philosophers have 
contented themselves with the clarification of concepts. His remarks echo proposals by the 
logical positivists and others. The positivists claimed that there are only two varieties of 
knowledge, viz. empirical and purely conceptual knowledge. Quine, in his ‘Two Dogmas 
of Empiricism’, famously claimed that the distinction cannot be drawn in the way sug- 
gested. He proposed a picture in which there is a web of beliefs that somehow latch onto 
experience. Some of our beliefs are very close to what we experience, others are more 
remote. The former are part of the natural sciences; the latter may become more philo- 
sophical (Quine, 1951). If this is correct, then there is not a sharp distinction, but rather 
a continuous transition between the natural sciences and philosophy. Whatever we may 
think of this picture in general, it seems quite appropriate for a huge part of modern philos- 
ophy of physics. The philosophy of physics does not aim at a variety of knowledge that is 
completely independent of experience. It does draw on experience, but it is concerned with 
more general issues than physics is. The distinction between physics and its philosophy is 
mainly pragmatic, deriving from the fact that physics has become so complicated that a 
division of labour seems appropriate. 


3.8 Conclusion 


To summarise: In this commentary, I have raised a number of questions about a common 
picture of how physics, particularly cosmology, on the one hand, and philosophy, on the 
other hand, have evolved and how they are related to each other. Some of the questions 
point to difficulties with the picture and reveal that it is too optimistic. I have not 
answered all my questions; but the discussion should at least have pointed to the 
presuppositions of the picture. A lot of questions are thus left open, many of them for 
the philosophy of physics, in particular of cosmology. 
On the Question Why There Exists Something 
Rather Than Nothing 


RODERICH TUMULKA 


4.1 Introduction 


In my opinion, nothing useful has ever been written on the question in the title, and small 
is the contribution that I have to offer. I outline an explanation for why there is something 
rather than nothing, an explanation which, however, I believe is incorrect because it makes 
a certain empirical prediction (absence of qualia) that is incorrect. Nevertheless, it may 
be interesting to discuss this reasoning. It allows one, in principle though not in practice, to 
derive the laws of nature and all physical facts about the universe. Then I elucidate which 
objections to this explanation are, in my opinion, valid and which are not. 

Explanation in physics usually works this way: observable phenomena get explained by 
physical theories. A physical theory is the hypothesis that the physical world consists of 
certain kinds of physical objects governed by certain laws. From this hypothesis we derive, 
or we make it plausible that it can be derived, that the phenomenon in question (typically) 
occurs; then we say that the theory explains the phenomenon. The physical theory does not 
explain why these kinds of physical objects exist, why others do not, and why these laws 
hold; instead of explaining them, the theory merely posits them. At best, some theories are 
simpler and more elegant than others (e.g. Einstein’s general relativity more than Newton’s 
theory of gravity). But no physical theory comes close to explaining why these laws hold, 
or why there exist any physical objects at all. Thus, no physical theory contributes to the 
question in the title. 

Specifically, some physical theories allow for the possibility of a vacuum state, i.e. that 
at a certain time there is no matter in space, and for the possibility of a transition from 
a vacuum state to a non-vacuum state, i.e. that at some other time there is some matter in 
space. While such a theory has some explanatory value, it does not touch upon the question 
in the title, as it does not explain the physical laws, nor why space-time exists,¹ nor why 
certain kinds of matter (described by certain kinds of mathematical variables) exist and 
others do not. For related discussion, see Holt’s overview [4] and Albert’s critique [1] of 
Krauss’s book [5]. 


1 I also call space-time, not only pieces of matter, a physical object. 
In this chapter I outline a novel type of explanation of why anything exists. Obviously, 
the reasoning is very different from usual physical theories. The explanation also aims to 
explain why the world is the way it is. In particular, it allows one in principle to derive the 
laws of physics and other testable consequences. I will answer objections that I think are not 
valid, and I will argue that the explanation I describe, although it is coherent reasoning, 
ultimately fails for a reason connected to the mind-body problem (see Chalmers, 1996 [3] 
and references therein): it predicts the non-existence of qualia. 

Even if the explanation fails it may be worth considering because there are, as far as 
I am aware, no other explanations for why anything exists. Perhaps the closest that any 
earlier reasoning came to such an explanation was Anselm of Canterbury’s (1033-1109) 
ontological argument for the existence of god [2], endorsed by Leibniz (see Loewer, 1978 
[6] for a critique). I believe that Anselm’s argument is not coherent and mine is; but I will 
not discuss Anselm’s argument here. 

I have the sense that many scientists and philosophers think that the question in the 
title is meaningless and cannot be answered. On the contrary, I think that the question is 
meaningful, as I understand its meaning, and the explanation I describe suggests to me that 
a rational answer might be conceivable. 


4.2 The Explanation 


The reasoning begins with the facts of mathematics. There is no mystery about why they 
are true. Their truth lies in the nature of mathematics and is explained by their content. 
Even if the physical world had been created by a god, the facts of mathematics would have 
been facts prior to that act, that is, it would not have been within the god’s power to change 
these facts. Along with the mathematical facts come mathematical objects, such as the 
empty set, the set containing only the empty set, the ordinal numbers (an ordinal number is 
a set α of sets that is well-ordered by the ∈ relation and ∈-transitive, i.e. contains each of its 
elements as a subset), the natural numbers regarded as the finite ordinal numbers, and the 
things that can be constructed from the natural numbers. I take it that mathematical objects 
exist in some sense. Not in the same way as physical objects, but in the way appropriate 
for mathematical objects. 

Now suppose, for the sake of the reasoning, that mathematical objects could naturally 
be arranged in a way that looks like a discrete space-time, say a discrete version of a 
four-dimensional manifold. For example, it can be argued that all mathematical objects 
are ultimately sets; and all sets naturally form a directed graph with the ∈ relation as the 
edges; and all sets naturally form a hierarchy according to their type.² A graph is somewhat 
similar to a discrete space-time; indeed some proposed definitions of discrete space-time 
say that such a thing is a partially ordered set — which is more or less a directed graph. The 
type has certain similarities with time (see Remark 6. in Section 4.3). The example breaks 
down at some point, however, since I do not see any similarity between the sets of a given 


2 The type of a set [7] is an ordinal number such that a set of type α contains only sets of type smaller than α. 
type and three-dimensional space. But suppose that the mathematical objects did naturally 
form a discrete space-time ℳ. 

Suppose further that there is a property M that some mathematical objects have and oth- 
ers do not, and which bears a natural similarity to the property of being matter (I will come 
back to this in Remark 6. below). Then some space-time points in ℳ will be ‘occupied by 
matter’ (in the sense of having the property M) and others will not. They may form 
world-lines or other subsets of ℳ. Thus, in a certain way of looking at the world of mathematics, 
it appears like a space-time with matter moving in there. For example, think of all 
mathematical objects as being sets, take ℳ again to be the graph of all sets with the ∈ relation, 
and let M be the property of being an ordinal number. That is a simple property that some 
objects have and others do not. If we think of type as time then, since there is exactly one 
ordinal for every type, at every time there is exactly one space-time point containing mat- 
ter — a particle world-line. The example is of limited use since it represents a space-time 
with a single particle. But suppose that for some natural property M, the pattern of matter 
in space-time is more complicated, more like patterns in the physical world. 
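
The toy example above can be checked by brute force on a small finite fragment of the world of sets. The following is only a minimal sketch and not part of the argument itself: it builds the sets obtainable in four power-set steps, takes the membership relation as the edges of the graph, reads ‘type’ as the usual set-theoretic rank (an assumption about what footnote 2 intends), and takes M to be the property of being an ordinal. All function names are illustrative.

```python
from itertools import combinations


def powerset(sets):
    """All subsets of a finite collection of sets, each returned as a frozenset."""
    items = list(sets)
    return {
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    }


def finite_universe(steps):
    """Apply the power-set operation `steps` times, starting from nothing."""
    universe = set()
    for _ in range(steps):
        universe = powerset(universe)
    return universe


def set_type(s):
    """Type (rank) of a finite set: one more than the largest type among its elements."""
    return 0 if not s else 1 + max(set_type(x) for x in s)


def is_transitive(s):
    """A set is transitive if each of its elements is also a subset of it."""
    return all(x <= s for x in s)


def has_property_m(s):
    """M = 'is an ordinal': a transitive set all of whose elements are transitive."""
    return is_transitive(s) and all(is_transitive(x) for x in s)


def membership_edges(universe):
    """Directed edges (x, s) of the membership graph, one for each x in s."""
    return {(x, s) for s in universe for x in s}


def world_line(universe):
    """Map each type ('time') to the points of that type that carry matter."""
    line = {}
    for s in universe:
        if has_property_m(s):
            line.setdefault(set_type(s), []).append(s)
    return line


universe = finite_universe(4)        # the sets reachable in four power-set steps
edges = membership_edges(universe)   # the discrete 'space-time' graph
matter = world_line(universe)

for time in sorted(matter):
    # For each 'time', list the number of elements of every point carrying matter.
    print(time, [len(point) for point in matter[time]])
```

On this fragment exactly one set of each type has property M, which is the single particle world-line described above; the sketch says nothing about whether a richer choice of M exists.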

Suppose further that in ℳ and in this pattern of matter, there are some sub-patterns 
which, if they actually were the patterns of physical matter in a physical space-time, would 
constitute intelligent beings. Then I feel that we are justified in saying that in some sense these 
beings actually exist, since they are made of the elements of ℳ, which do exist. Let me 
call these beings the mathians. To the mathians, ℳ must appear like a physical space-time, 
as it is the space-time in which they live, and the objects with property M must appear like 
matter, as it is the matter of which they are made. 

If the mathians ask what the physical laws of their space-time are, their question can 
be answered by pure thinking, by pure mathematics. In fact, every historical fact of their 
universe is determined by pure mathematics, although the mathians may not be able to 
carry out the thinking needed to determine these facts. 

If the mathians ask why their universe exists, then their question can indeed be answered. 
The correct answer is that ℳ, and the pattern of matter in ℳ and the mathians, exist 
because they are mathematical objects, which exist by their nature. If the mathians ask 
why the matter in their universe is distributed in a certain way, then their question can be 
answered. The correct answer is that the distribution of matter in their universe follows 
logically from the mathematical facts, which are necessarily true. It seems plausible to 
me that the mathians may or may not be aware that they are the mathians, and that their 
universe is ℳ, and that there is a reason for why their universe exists and looks the way it 
does. Or some mathians may be aware and others may not. I see no reason why all mathians 
would have to be aware that they are mathians. 

Now suppose that we are mathians, and that our space-time is ℳ. Then there is an 
explanation of why our universe exists, and of why matter is distributed in a certain way 
in it. That is the explanation I promised. Let me call it the i-explanation. The i-explanation 
is that the matter of which we consist ultimately consists of mathematical objects, and the 
facts of our universe are ultimately mathematical facts; and there is no mystery to why 
mathematical objects exist or to why mathematical facts are true. 
4.3 Remarks 


1. The statement that the matter of which we consist ultimately consists of mathematical 
objects may sound similar to statements such as that (i) a particle is a unitary repre- 
sentation of the Lorentz group, or that (ii) reality according to quantum field theory 
consists of field operators, or that (iii) reality according to quantum physics consists of 
pure information (‘it from bit’). But it is not similar. 

Statement (i) really means, I take it, that the possible types of physical particles in our 
universe correspond to the unitary representations of the Lorentz group; not that an indi- 
vidual particle literally is one of these representations. According to the i-explanation, 
in contrast, an individual particle (if our universe contains point particles) at a particular 
point in time actually is a certain mathematical object. This object, by the way, is not as 
simple and beautiful as a unitary representation of the Lorentz group; rather, it must be 
an exorbitantly complicated mathematical object that is unlikely ever to be individually 
considered by a mathematician because there are so many other objects with similar 
properties. 

Let me turn to statement (ii). Its meaning is unclear to me, as I do not understand how 
operators can be a mathematical description of matter. I do understand, in contrast, how 
a subset of space-time can be a mathematical description of matter, and that is what the 
i-explanation provides. 

Statement (iii) means, I take it, that physical facts are not objective but exist only if 
observed by intelligent beings. According to the i-explanation, in contrast, there are 
objective physical facts, whether observed or not, namely that space-time is given by 
ℳ and that matter is located at those space-time points with property M. 

2. There are two ways in which mathematical objects and facts are relevant to the phys- 
ical world according to the i-explanation. First, they may form part of the physical 
world, and second, they may apply to the physical world. For example, suppose ℳ 
was the graph of all sets with the ∈ relation. The number 5, understood as an ordinal 
number, is a set and therefore a space-time point in ℳ. However, the number 5 
may also apply to other space-time regions, for example because there are 5 particles 
in a certain space-time region, i.e. there are 5 sets with property M in a certain family 
of sets. We are familiar with the second role of mathematical objects but not with the 
first. 

3. Needless to say, the i-explanation does not claim that every physical theory (say, Newtonian 
mechanics) is realised in some (part of a) physically existing universe, in contrast 
to an idea proposed by Tegmark [8, 9]. For example, any particular conceivable universe 
governed by Newtonian mechanics is a mathematical object, and thus corresponds to a 
space-time point rather than to the whole universe or a region therein; in particular, the 
universe ℳ is not governed by Newtonian mechanics. 

4. It is useful to distinguish between a specific i-explanation and the i-framework. To 
specify an i-explanation, it is necessary to specify how to arrange the totality of mathe- 
matical objects as a discrete space-time, and to specify the property M. Put differently, 
it is necessary to specify an isomorphism, or identification, i between the mathematical 
world and the physical world. I have not specified i, so I have actually not provided an 
i-explanation. I have only outlined how such an explanation would work if i could be 
provided. That is, I have described the i-framework. 

As long as no candidate for i has been specified, it perhaps does not seem particularly 
likely that such an isomorphism i exists. The failure of the example involving the 
graph of all sets with the ∈ relation, and my failure to come up with a better example, 
make it seem even less likely. Yet, it seems conceivable that i exists, and it seems that 
if i exists then the i-explanation does explain why there exists something rather than 
nothing. 

If i exists and we know what it is then it is possible, at least in principle, to derive all 
laws of physics. It is also possible in principle, in this case, to determine all historical 
facts, past and future, unless there are limitations to the mathematical facts we can 
find out about. However, it is conceivable that the computational cost of finding out 
about interesting historical facts by analysing the corresponding mathematical facts is 
prohibitively expensive, and in particular that finding out facts about our future, such as 
next week’s stock prices, by means of a computation of the corresponding mathematical 
facts necessarily takes longer than it takes those facts to occur (in the example, longer 
than one week). 

It follows in particular that any specific candidate for i can be, at least in principle, tested 
empirically, as it entails a particular distribution of matter in space-time that can be com- 
pared to the one in our universe. But even without such test results, some candidates for 
i may be more persuasive, or more attractive, than others on purely theoretical grounds. 
For example, we may judge candidates for i by their elegance and would require, in 
particular, that M be a very simple property: If its definition were enormously compli- 
cated, then that would cost the explanation much of its explanatory value. Furthermore, 
an i-explanation would seem more compelling if M could be argued to be not any old 
mathematical property but a very special one, one that plays a crucial mathematical 
role. For example, the property of a set to be both well ordered by ∈ and ∈-transitive 
is a very special property, as testified to by the fact that mathematicians have a name 
(ordinal number) for such sets; basically, this property is special because well-orderings 
play a crucial mathematical role in set theory. 

Furthermore, an i-explanation would seem more compelling if analogies can be 
pointed out between M and the physical property (of a space-time point) of being occu- 
pied with matter. In the same vein, an i-explanation would seem more compelling if 
analogies can be pointed out between ℳ and space-time. For example, if ℳ is the 
graph of all sets directed by the ∈ relation, I can point out that the concept of type of a 
set [7] bears some analogies with the concept of time: Both are linearly ordered, and, 
in fact, since the type governs the order in which sets must be defined so as to allow 
only well-defined objects as elements when forming a set, it seems natural to think of 
the type as a kind of ‘logical time of set theory’. 
4.4 Objections 


Objection: To the question of why anything exists, most of the argument is irrelevant. 
Already in the first paragraph of Section 4.2, it was claimed that mathematical objects 
exist. Therefore, something exists, and the remainder of the argument did not contribute to 
answering the question in the title. On the other hand, the first paragraph of Section 4.2 did 
more or less take for granted that mathematical objects exist, but did not explain why they 
exist, and did not answer the question in the title. 


Answer: There are two relevant senses of existence: mathematical and physical. The title 
refers to physical existence. The first paragraph of Section 4.2 refers to mathematical exis- 
tence. Why something exists mathematically is easy to explain: Mathematical objects do, 
as soon as they are conceivable. Why something exists physically is hard to explain. In 
particular, physical objects do not exist physically as soon as they are conceivable. The 
argument takes the mathematical existence of mathematical objects for granted and aims 
at explaining the physical existence of physical objects. That is why it is not over after the 
first paragraph. 


* * * 


Objection: No statement about physical existence can follow from statements exclusively 
concerned with mathematics. That is a matter of elementary logic, just as ‘ought’ cannot 
follow from ‘is’. 


Answer: That is true only as long as nothing is known about the meaning and the nature 
of physical existence. The hypothesis of our reasoning is that the physical existence of a 
space-time point x ultimately means the mathematical existence of a certain mathematical 
object O corresponding to x, and that the physical existence of matter at x ultimately means 
that O has the property M. If this is the case then statements about physical existence can 
clearly follow from mathematical statements. 


* * * 


Objection: The i-explanation aims at explaining the existence of a physical world, of space- 
time, matter in motion, and all facts that supervene on matter in motion. But it does not 
lead to qualia (i.e. conscious experiences, such as experience of the colour red [3]). The 
mathians do not have qualia (i.e. they are not conscious, they do not experience the colour 
red). If they claim they do then they are mistaken. I know that I have qualia, so I cannot 
be a mathian. If the i-explanation were correct then qualia would not exist. Thus, the i- 
framework makes a prediction, the absence of qualia, that can be regarded as an empirical 
prediction and that our findings disagree with. 


Answer: I think that this objection is a good argument against the i-explanation. It may be 
tempting to hypothesise that qualia ultimately are mathematical properties, but that does 
not fit with the nature of qualia: no mathematical structure would explain the way the colour 
red looks (see, e.g. Chalmers, 1996 [3]), so the nature of an experience of red cannot be 
mathematical. So I agree that the i-framework makes a prediction, the absence of qualia, 
that is empirically false. I conclude that the i-explanation, while it is a possible explanation 
of the existence of the physical world, is not the correct explanation. 
Part II 


Structures in the Universe and the Structure 
of Modern Cosmology 


5 
Some Generalities About Generality 


JOHN D. BARROW 


5.1 Introduction 


The equations of general relativity and its extensions are mathematically complicated 
and their general coordinate covariance offers special challenges to anyone seeking exact 
solutions or conducting numerical simulations. They are non-linear in a self-interacting 
(non-Abelian) way because the mediator of the gravitational interaction (the graviton) also 
feels the gravitational force. By contrast in an Abelian theory, like electromagnetism, the 
photon does not possess the electric charge that it mediates. As a result of this formidable 
complexity and non-linearity, the known exact solutions of general relativity have always 
possessed special properties. High symmetry, or some other simplifying mathematical 
property, is required if Einstein’s equations are to be solved exactly. General solutions 
are out of reach. 

This ‘generality’ problem has been a recurrent one in relativistic cosmology from the 
outset in 1917 when Einstein [1] first proposed a static spatially homogeneous and isotropic 
cosmological model with non-Euclidean spatial geometry in which gravitationally attrac- 
tive matter is counter-balanced by a positive cosmological constant. This solution turned 
out to be unstable [2-6]. Subsequently, the appearance of an apparent ‘beginning’ and 
‘end’ to simple expanding-universe solutions led to a long debate over whether these fea- 
tures were also unstable artefacts of high symmetry or special choices of matter in the 
known cosmological solutions, as Einstein thought possible. The quest to decide this issue 
culminated in a new definition of such ‘singularities’ which allowed precise theorems to 
be proved without the use of special symmetry assumptions. In fact, by using the geodesic 
equations, their proofs made no use of the Einstein equations [7, 8]. Special solutions of 
Einstein’s equations, like the famous Gödel metric [9] with its closed timelike curves, also 
provoked a series of technical studies of whether its time-travelling paths are a general fea- 
ture of solutions to Einstein’s equations, or just isolated unstable examples. In the period 
1967-1980 there was considerable interest in determining whether the observed isotropy 
of the microwave background radiation could be explained because it appeared to be an 
unstable property of expanding universes [10, 11]. The mechanism of ‘inflation’, first pro- 
posed in 1981 by Guth [12], provided a scenario in which this conclusion could be reversed, 
and isotropy could be a stable (or asymptotically stable) property of expanding-universe 
solutions, by widening the range of conditions on the allowed forms of matter that could 
dominate the expansion dynamics of the very early universe [13-18]. Just to show how 
knowledge, fashion, and belief change, the requirements on the density, ρ, and pressure, 
p, of matter content needed for inflation to occur (ρ + 3p < 0) are exactly the opposite of 
those assumed (ρ + 3p > 0) in the principal singularity theorems of Penrose and Hawking 
[7, 8, 19] in order to establish sufficient conditions for a singularity (at least one incomplete 
geodesic) to have occurred in our past. 
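
The role played by the sign of ρ + 3p can be seen in the simplest setting. As a sketch only, take the special case of an isotropic and homogeneous universe with no cosmological constant, in units with c = 1; the scale factor a(t) and the overdot for time derivatives are notation introduced here for illustration and do not appear in the text above. The standard acceleration equation then reads:

```latex
\[
  \frac{\ddot{a}}{a} \;=\; -\,\frac{4\pi G}{3}\,(\rho + 3p) .
\]
```

If ρ + 3p > 0 the expansion decelerates, which is the attractive-gravity situation assumed in the singularity theorems; if ρ + 3p < 0 the expansion accelerates, as inflation requires. The theorems themselves use a more general form of this condition that does not rely on the symmetry assumed here.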

In the study of differential equations, an exact solution is called stable if small per- 
turbations remain bounded as time increases; it is called asymptotically stable if the 
perturbations die away to zero with increasing time. Our solar system is dynamically sta- 
ble but not asymptotically stable. Another useful pair of definitions are those introduced by 
Hawking [20] in 1971, who uses the same word in a technically different way. He defines 
a ‘stable’ or ‘open’ property of a dynamical system to be one that occurs from an open set 
(rather than merely a single point) in initial data space. However, it is possible for a prop- 
erty of a cosmological model to be stable but be of no physical interest: it is a necessary 
but not a sufficient property for physical relevance because the property in question could 
be stable only in open neighbourhoods of initial data space describing universes with other 
highly unrealistic properties (contraction or extreme anisotropy, for example). A ‘generic’ 
or ‘open dense’ property will be one that occurs near almost every initial data set (that is, 
it is open dense on the space of all initial data). A sufficient condition for a stable property 
to be of physical interest is that it is generic in this sense [21]. 
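
The first pair of definitions above (stable versus asymptotically stable) can be illustrated with two textbook equations. The sketch below is an illustration chosen here, not an example from the text: an undamped oscillator stands in for a system whose perturbations stay bounded but never decay, and a lightly damped one for a system whose perturbations die away. The function names and the damping value gamma = 0.1 are assumptions made for the example.

```python
import math


def undamped(t, x0=1.0):
    """Perturbation obeying x'' = -x, released from rest at x0."""
    return x0 * math.cos(t)


def damped(t, x0=1.0, gamma=0.1):
    """Perturbation obeying x'' = -x - 2*gamma*x' (0 < gamma < 1), released from rest at x0."""
    omega = math.sqrt(1.0 - gamma ** 2)
    envelope = x0 * math.exp(-gamma * t)
    return envelope * (math.cos(omega * t) + (gamma / omega) * math.sin(omega * t))


def largest_excursion(solution, t_start, t_end, samples=20000):
    """Largest sampled |x(t)| on the interval [t_start, t_end]."""
    step = (t_end - t_start) / samples
    return max(abs(solution(t_start + k * step)) for k in range(samples + 1))


# Stable: the perturbation never exceeds its initial size ...
print(largest_excursion(undamped, 0.0, 300.0))
# ... but it does not die away at late times either.
print(largest_excursion(undamped, 200.0, 300.0))
# Asymptotically stable: at late times almost nothing is left.
print(largest_excursion(damped, 200.0, 300.0))
```

Closed-form solutions are used instead of a numerical integrator so that the late-time behaviour is not contaminated by integration error.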

In this chapter we will discuss approaches to the problem of assessing generality and 
some of the results that arise in typical and topical cosmological problems. We will try to 
avoid significant technicalities. There is a deliberate emphasis upon fundamental questions 
of interest to philosophers of science rather than upon the astrophysical complexities of the 
best-fit cosmological models or the galaxy of inflationary universe models. Attention will 
be focussed on classical general relativity; aspects of quantum cosmology will be treated 
in other chapters. 


5.2 General Relativistic and Newtonian Cosmology 


General relativity is a much larger theory than Newtonian gravity. It has ten symmetric 
metric potentials, g_ab, instead of one Newtonian gravitational potential, Φ, and ten field 
equations (Einstein’s equations) instead of a single one (Poisson’s equation) to determine 
them from the material content of space and time. Newtonian gravity has a fixed time and a 
fixed space geometry which is usually taken to be a monotonous linear time plus a three- 
dimensional (3D) Euclidean space (although another fixed curved space could be assumed 
simply by using the appropriate ∇² operator in Poisson’s equation). 
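
For orientation, the two sets of field equations being compared can be written side by side. This is the standard textbook form, given as a sketch with the cosmological constant omitted; the symbols G_ab (Einstein tensor), T_ab (energy-momentum tensor) and ρ (mass density) are notation introduced here rather than taken from the text above.

```latex
\[
  \nabla^{2}\Phi = 4\pi G\rho
  \qquad \text{(Newtonian gravity: one equation for the one potential } \Phi\text{)},
\]
\[
  G_{ab} = \frac{8\pi G}{c^{4}}\,T_{ab},
  \qquad a, b = 0, 1, 2, 3
  \qquad \text{(general relativity)}.
\]
\[
  \text{Independent components of a symmetric } 4 \times 4 \text{ tensor:}
  \quad \frac{4 \cdot 5}{2} = 10 .
\]
```

The count of ten applies equally to the metric potentials and to the field equations that determine them.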

Despite appearances, Newton’s theory is not really complete and Newtonian cosmology 
is not a well-posed theory [22, 23]. Unlike general relativity, it contains no propagation 
equations for the shear distortion and the formulation of anisotropic cosmological models 
requires these to be put in by hand. As a result the Newtonian description of an isotropic 
and homogeneous cosmology looks exactly like general relativity [24] because these shear 
degrees of freedom are necessarily absent. This feature manifests itself in results for the 
general asymptotic behaviour of the Newtonian n-body problem in the unbound (expand- 
ing) case. Rigorous results can be obtained for the moment of inertia (or radius of gyration), 
or rotation of the total finite mass of n-bodies, but not for its shape [22, 25]. 

All solutions of Einstein’s equations describe entire universes. The relative sizes of the 
two theories mean that infinitely many of these general relativistic solutions possess no 
Newtonian counterpart. However, there are also Newtonian ‘universes’ which have no 
counterpart in general relativity. For example, there are shear-free Newtonian solutions 
with expansion and rotation: these cannot exist in general relativity [26]. More striking, 
there exist solutions of the Newtonian n-body problem (for n > 4) in which a system 
of point particles expands to infinite size in finite time, undergoing an infinite number of 
oscillations in the process [27]. For example, two counter-rotating binary pairs, all of equal 
mass, with a lighter particle oscillating between their centres along a line perpendicular to 
their orbital planes, can expand to infinite size as a result of an infinite number of recoils 
in a finite time! This is only possible because Newtonian point particles can get arbitrarily 
close to one another and so the 1/r² forces between them can become arbitrarily large. In 
general relativity, this cannot happen. When two point particles of mass M approach closer 
than 4GM/c² an event horizon forms around them. This is a simple example of a form 
of ‘cosmic censorship’ that saves us from the occurrence of an actual observable infinity, 
in Aristotle’s sense [28], locally. In general relativity there is evidence that under broad 
conditions there is a maximum force, equal to c⁴/G [29], as well as the more fundamental 
maximum velocity for information transfer, c: neither of these relativistic limits on velocity 
and force strength exists in Newtonian theory. 
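
The separation quoted above is what one obtains by assuming, as a rough sketch, that a horizon forms once the total mass 2M lies within its own Schwarzschild radius r_s. Evaluating the Newtonian inverse-square force at that separation then gives a value that no longer depends on M, which is a heuristic way of seeing why a mass-independent force scale built from c and G appears. The symbol r_s and the estimate itself are introduced here for illustration; the numerical factor in this crude calculation should not be read as the precise bound of [29].

```latex
\[
  r_{s}(2M) = \frac{2G\,(2M)}{c^{2}} = \frac{4GM}{c^{2}},
  \qquad
  F\Big|_{r = r_{s}} = \frac{GM^{2}}{r_{s}^{2}}
  = \frac{GM^{2}\,c^{4}}{16\,G^{2}M^{2}}
  = \frac{c^{4}}{16\,G} .
\]
```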


5.3 Generality — Some Historic Cases 


There has been a succession of cosmological problems where particular solutions were 
found with striking properties that required further analysis to determine whether those 
properties were general features of cosmological solutions to the Einstein equations. 


5.3.1 Static Universes 


The first isotropic and homogeneous cosmological model found by Einstein [1] was a 
static universe with zero-pressure matter, positive cosmological constant and a positive 
curvature of space. Subsequently, this solution was shown to be unstable when it was per- 
turbed within the family of possible isotropic and homogeneous solutions of Einstein’s 
equations by Eddington and implicitly by Lemaître [30], who found the general solutions 
of which Einstein’s universe was a particular, and clearly unstable, case. These demon- 
strations led to the immediate abandonment of the static universe and, in Einstein’s case, 
of the cosmological constant as well [31].¹ It turns out that this stability problem is more 
complicated than it appears and has only been completely explored, when other forms of 
matter are present, quite recently. The static universe is only unstable against small 
inhomogeneous perturbations on scales exceeding the Jeans length when p/ρ < 1/5. When 
1 > p/ρ > 1/5, the Jeans length for the inhomogeneities exceeds the size of the universe 
and so the perturbations do not become Jeans unstable and amplify in time [2-6]. 
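
The instability within the isotropic and homogeneous family can be sketched in a few lines for the zero-pressure case. The derivation below assumes the standard Friedmann equations with cosmological constant Λ, scale factor a(t) and curvature index k, in units with c = 1; this notation is introduced here for illustration and is not used in the text above.

```latex
\[
  \frac{\ddot{a}}{a} = -\frac{4\pi G}{3}\,\rho + \frac{\Lambda}{3},
  \qquad
  \frac{\dot{a}^{2}}{a^{2}} = \frac{8\pi G}{3}\,\rho - \frac{k}{a^{2}} + \frac{\Lambda}{3}
  \qquad (p = 0).
\]
% A static solution a = a_0, rho = rho_0 needs both derivatives to vanish:
\[
  \dot{a} = \ddot{a} = 0
  \;\Longrightarrow\;
  \Lambda = 4\pi G\rho_{0},
  \qquad
  \frac{k}{a_{0}^{2}} = 4\pi G\rho_{0} = \Lambda > 0 .
\]
% Perturb a = a_0 + delta a with rho a^3 held constant and linearise:
\[
  \delta\ddot{a}
  = \left( \frac{8\pi G\rho_{0}}{3} + \frac{\Lambda}{3} \right) \delta a
  = \Lambda\,\delta a
  \;\Longrightarrow\;
  \delta a \propto e^{\pm\sqrt{\Lambda}\,t} .
\]
```

One of the two modes grows, so the static solution requires positive curvature and a finely balanced Λ, and it is unstable to homogeneous perturbations. The inhomogeneous case discussed above needs a separate analysis.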


5.3.2 Singularities 


There is a long and interesting history of attempts to interpret and avoid ‘singularities’ in 
the cosmological solutions of Einstein’s equations. In the first expanding solutions with 
zero-pressure matter found by Friedmann [32], it appeared that there was a necessary 
beginning to the expansion with infinite density at a finite time in the past and there could 
(in spatially closed cases) also be an apparent end to the universe at a finite time in the 
future. One response to these infinities, particularly by Einstein, was to question whether 
they would remain if the family of solutions was widened. First, Einstein asked whether 
the addition of pressure would resist the compression to infinite density. Lemaître showed 
that adding pressure actually made the problem worse by hastening the appearance of the 
infinite density [33]. The reason is a relativistic one. Whereas in Newtonian physics any 
pressure resists gravitational compression, in relativity pressure also gravitates because it 
is a form of energy (and so has an equivalent mass via ‘E = mc²’) and increases the 
compression (see also Chandrasekhar and Miller, 1974 [34]). Next, Einstein wondered whether 
it was the perfect isotropy of the expanding-universe solutions that was responsible. If 
anisotropy was allowed then perhaps the compression would be defocused and the 
singularity avoided. Again, Lemaître was easily able to show that simple anisotropic universes 
have the same types of singularity and they are approached more quickly than in the isotropic 
case [33]. 

These investigations by Lemaître amounted to tests of the stability of the singularity’s 
occurrence within different wider sets of initial data. Many cosmologists were convinced 
by these examples that singularities were ubiquitous in these types of cosmology unless 
new forms of matter could be found which resisted compression to infinite density. 
One such material was the C-field of the steady state theory, introduced by Hoyle to 
describe ‘continuous creation’ of matter in a de Sitter universe that avoided the high-density
singularities of the Friedmann–Lemaître models. However, all its null geodesics
are past incomplete and so it is technically singular [35] (a feature that has been recently 
rediscovered in another context [36]). 


¹ Einstein supposedly said that introducing the cosmological constant was the biggest blunder of his life,
presumably because it meant that he failed to predict the expansion of the universe, but I can find no primary trace
of such a remark. Einstein did make a similar remark about his signing of the letter to Roosevelt urging the
construction of an atomic bomb (although he didn’t work on the Manhattan Project because he was judged to
be a potential security risk). In 1954, he called his decision to sign ‘the one great mistake in my life’ [31] which
suggests he never made such a remark about the cosmological constant.

Later, in the period 1957-1966, a different approach was pursued for a while by members
of Landau’s school in Moscow, notably by Khalatnikov and Lifshitz [37]. Initially,
they set out to show that singularities did not occur in the general solution of the Einstein
equations. Their argument, which was incorrect, was that because singularities arose
in solutions that were not general (like the Friedmann solutions or the anisotropic Kasner
universes) they would not appear in the general solution. The circumstantial evidence
for this conclusion was the belief that the singularity arising in these cosmological solu- 
tions was just a singularity in the coordinate system used to describe the dynamics and so 
was unphysical (as is the ‘singularity’ that arises at the North Pole of the Earth where 
the meridians intersect in standard mapping coordinates). When it occurred you could 
change to a new set of coordinates until they too became singular (as they always did) 
and so on ad infinitum. Unfortunately, it is important to investigate what happens in the 
limit of this process: a true physical singularity remains — as became increasingly clear 
when the problem was subjected to a different sort of analysis. The singularity theorems 
of Hawking and Penrose [7, 8] were able to define sufficient conditions for the forma- 
tion of a singularity by adopting a definition of a singularity as an inextensible path of a 
particle or light-ray in spacetime. These theorems made no reference to special symme- 
tries or the subtleties of coordinate choices. Singularities were where time ran out: part 
of the edge of spacetime [38]. It remained to be shown that these endpoints were caused 
by infinities in physical quantities. To some extent this can be done but the full story is 
by no means complete, even now. These theorems ended the argument about whether 
cosmological singularities were physically real and general and later work by Belinskii, 
Khalatnikov and Lifshitz refocused upon finding the general behaviour near a physical 
singularity [39]. 

It is important to stress that the singularity theorems are theorems, not theories. They give
sufficient conditions for singularities, so if their assumptions are not all met this does not
mean that there is no inevitable singularity, merely that no conclusions can be drawn. The
interesting historical aspect which we signalled in the introduction is that the sufficient
conditions generally included the requirement that the matter content of the universe obeys
$\rho + 3p > 0$. We no longer believe this inequality holds for all matter sources. Indeed, the
observations that the universe is accelerating could be claimed to show that the assumption
that $\rho + 3p > 0$ is false.
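
For the isotropic Friedmann models the connection between this inequality and acceleration can be made explicit. As a sketch, using the standard acceleration equation of general relativity for the scale factor $a(t)$, with total density $\rho$ and pressure $p$ (units with $c = 1$ are assumed here, and any cosmological constant is taken to be absorbed into $\rho$ and $p$):

```latex
\frac{\ddot{a}}{a} = -\frac{4\pi G}{3}\,(\rho + 3p)
\quad\Longrightarrow\quad
\ddot{a} > 0 \iff \rho + 3p < 0 .
```

An accelerating expansion therefore requires the dominant matter source to violate $\rho + 3p > 0$, which is the sense in which the observations bear on the assumptions of the theorems.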


5.3.3 Isotropisation 


After the discovery in 1967 of the high level of isotropy in the CMB temperature 
distribution [40] there was a long effort to explain why this was the case. Up until then, cos- 
mologists had assumed an isotropic and homogeneous background universe and regarded 
the presence of small inhomogeneities (like galaxies) as the major mystery requiring a simple
explanation. The discovery of the CMB isotropy created a new perspective in
which it was the high isotropy and uniformity of the assumed background universe that
was the major mystery. A new approach, proposed by Misner and dubbed ‘chaotic cosmology’,
sought to show that general cosmological initial conditions would end up leading to
an isotropically expanding universe after more than about 10 billion years [10, 11]. This 
programme had an interesting methodological aspect. If it could be shown that almost all 
initial conditions (subject to some weak conditions of physical reasonableness) would lead 
to an isotropic universe, then observations of the isotropy level could not tell us anything 
about the initial conditions: memory of them would have been erased by the expansion. 

In studying whether this idea could work it was again the issue of generality that 
was crucial. It was asking whether isotropically expanding universes were stable or even 
asymptotically stable attractors at late times. Two types of analysis were performed. The 
first just asked whether anisotropic cosmologies would approach isotropy at late times 
if they just contain zero-pressure matter and radiation. The second, which Misner pro- 
posed, was to ask what happened if dissipative stresses could arise because of the presence 
of collisionless particles, like neutrinos and gravitons, at particular epochs in the very 
early universe. Perhaps large initial anisotropies could be damped out by these dissipative 
processes, leaving an isotropically expanding universe? 

Several interesting approaches to these questions were developed. On the physical level
Barrow and Matzner [41] showed that the chaotic cosmology philosophy could not work
in general because the dissipation of anisotropies and inhomogeneities must produce
heat radiation, in accord with the second law of thermodynamics. The earlier the dissipation
occurred, the larger the entropy per baryon produced. So, the observed entropy per baryon
today (of about $10^{9}$) placed a prohibitively strong bound on how much anisotropy could
have been damped out during the history of the universe. The presence of particle horizons
with proper radii proportional to $t$ in the early universe also placed a major constraint on the
damping of any large-scale inhomogeneities by causal processes. Misner [42] attempted to
circumvent this by discovering the remarkable possibility that spatially homogeneous universes
of Bianchi type IX (dubbed the ‘Mixmaster’ universe because of this property) could
potentially allow light to travel around the universe arbitrarily often on approach to $t = 0$
as a result of their chaotic dynamics. Unfortunately, this horizon removal mechanism was
ineffective in practice because of the improbability of the horizonless dynamical configurations
and the fact that only about 20 chaotic oscillations of the scale factor could have
occurred between the Planck time ($10^{-43}$ s) and the present [43]. There was also the concern
that if general relativistic cosmology was a well-behaved initial-value problem then
one could always concoct anomalously anisotropic universes today that could be evolved
back to their initial conditions at any arbitrary early time [44]. These would provide
counterexamples to the chaotic cosmology scheme, although one has to be careful with this
argument because the counterexamples could all have physically impossible initial conditions,
and in fact they often do [45].

In 1973 Collins and Hawking [46] carried out some interesting stability analyses of 
isotropic universes to discover if isotropy was a stable property of homogeneous initial 
data. The results found were widely discussed and reported at a semi-popular level but 
needed to be treated cautiously because of the fine detail in the theorems. They reported
that for ever-expanding universes isotropic expansion was ‘unstable’ but if attention was
narrowed to spatially flat initial data with zero-pressure matter then isotropy was ‘stable’.
The cosmological constant was assumed to be zero. The definition of stability used was
in fact asymptotic stability and so the proof that isotropy was unstable just meant that
anisotropies did not tend to zero as $t \to \infty$. In fact, closer analysis showed that in general
$\sigma/H \to$ constant in open universes (and it is impossible for $\sigma/H$ to grow asymptotically)
with $\rho + 3p > 0$. This means that isotropy is stable, although not asymptotically stable
[47, 48]. The other technicality is that this result is a consequence of the fact that these open
universes become vacuum (or spatial curvature dominated) at late times. The behaviour of
the anisotropy is therefore an asymptotic property of vacuum cosmologies and does not
tell us anything about the past history of the universe at redshifts $z > z_c$, where $z_c < O(1)$
is the redshift where the expansion becomes curvature dominated. These stability results
therefore did not help us understand the sort of initial data that could give rise to high
isotropy after about 10 billion years of expansion.


5.3.4 Cosmic No-Hair Theorems 


Amid all this interest in explaining the isotropy of the universe there was one prescient 
approach by Hoyle and Narlikar [49] that predated the discovery of the high isotropy of 
the CMB. In 1963 they pointed out that in the standard big bang model the isotropy and 
average uniformity of the universe was a mystery but in the steady state universe it would 
be naturally explained. To support this claim they showed that the de Sitter universe is sta- 
ble against scalar, vector and tensor perturbations. Thus, a steady state universe, described 
by the de Sitter metric of general relativity would always display high isotropy and uniformity.
In fact, had they only known it, they could have predicted that the only spectrum of
perturbations consistent with the steady state universe is the constant curvature spectrum
with constant (small) metric perturbations on all scales that is observed with high accuracy
today (although, ironically, via perturbations in the CMB whose existence the steady
state model could not explain). Any departure from this spectrum, with $\delta\rho/\rho \propto L^{-2}$ on
length scale $L$, would create divergent metric potential perturbations either as $L \to \infty$ or
as $L \to 0$.

The idea of the inflationary universe provided a new type of explanation for the observed
isotropy and uniformity of the universe from general initial conditions, but with one important
difference from past expectations: the isotropy and homogeneity were predicted to be
local. The inflationary universe theory proposed that there was a finite interval of time,
soon after the apparent beginning of the expansion (typically at $\sim 10^{-35}$ s in the original
conception of the theory), when the expansion of the universe would accelerate due to the
presence of very slowly evolving scalar fields of the sort that appeared in new theories
of high-energy physics. These would contribute stresses with $\rho + 3p < 0$ and cause the
expansion scale factor to accelerate. When this occurs the expansion rapidly approaches
isotropy. Anisotropies fall off very rapidly and isotropic expansion is asymptotically stable.
In the most likely scenario, where $p = -\rho$, the expansion behaves temporarily like
Hoyle’s steady state universe and grows exponentially in time. However, the inflation needs
to end, and this only happens if the scalar fields responsible decay into ordinary particles
and radiation with $\rho + 3p > 0$. When this happens the usual decelerating expansion is
resumed but with anisotropies so diminished in amplitude that they remain imperceptibly
small at late times [13-18].

Inflation works by taking a patch of the universe that is small enough for light
signals to cross it at an early time $t_i$ and expanding it so dramatically (exponentially in
time) that it grows larger than the entire visible universe today in the short period during
which inflation occurs. Thus the isotropy and high uniformity of the visible universe
today are a reflection of the fact that it is the expanded image of a region that was small
enough to be coordinated by light-like transport processes and damping when inflation
occurred. If inflation had not occurred, and the expansion had merely continued along its
standard decelerating trajectory then the initially smooth and isotropic region would not
have expanded significantly by the present time, $t_0$. Here is a simple calculation of how
this happens.

Suppose the present temperature of the CMB is $T_0 = 3$ K and when inflation occurred it
was $T_i = 3 \times 10^{28}$ K. Then, since $T \propto a^{-1}$, the scale factor has increased by a factor of
$T_i/T_0 = 10^{28}$. At time $t_i$, the horizon size is equal to $d(t_i) = 2ct_i$, where $t_i \sim 10^{-35}$ s, so
a horizon-sized region of size $2ct_i$ at $t_i$ would only have expanded to a size
$2 \times 3 \times 10^{10}$ cm s$^{-1}$ $\times\, 10^{-35}$ s $\times\, 10^{28} = 6 \times 10^{3}$ cm by the present day. This is not of any relevance
for explaining isotropy and uniformity over scales of order $ct_0 \sim 10^{28}$ cm today. However,
suppose inflation occurs at $t_i$ and inflates the expansion scale factor by a factor of $e^N$. We
will now be able to enlarge the causally connected region of size $2ct_i$ up to a scale of
$e^N \times 6 \times 10^{3}$ cm. This will exceed the size of the visible universe today if
$e^N \times 6 \times 10^{3} > 10^{28}$. This is easily possible with $N > 60$. If the expansion is exponential with $a(t) \propto e^{Ht}$
then we only need inflation to last from about $10^{-35}$ s until $10^{-33}$ s in order to effect
this. The regularity of the universe is therefore explained without any dissipation taking
place. A very tiny smooth patch is simply expanded to such an extent that its smooth and
isotropic character is reflected on the scale of the entire universe today. It is very likely
(just as it is more likely that a randomly chosen positive integer will be a very large one)
that the amount of inflation that occurred will be much larger than 60 e-folds. Yet, the
result is to predict that the universe will be uniform on the average out to the inflated scale
$e^N \times 6 \times 10^{3}$ cm but may be rather non-uniform if we could see further. In some variants of
the theory many other fundamental features of the universe (values of constants of Nature,
space dimensions, laws of physics) are different beyond the inflated scale as well. While
there have always been overly positivistic philosophers who have cautioned against simply
assuming that the unobserved part of the (possibly infinite) universe is the same on average
as the observed part, this is the first time there has been a positive prediction that we should
not expect them to be the same.
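
The arithmetic of this estimate can be checked with a short script. This is a minimal sketch using the round numbers quoted above; the function names are illustrative, and the choice $H = 1/(2t_i)$ for the Hubble rate during inflation (the radiation-era value at $t_i$) is an assumption made here, not something fixed by the text.

```python
import math

# Round numbers quoted in the text (cgs units); names are illustrative.
C_CM_PER_S = 3e10         # speed of light
T_NOW_K = 3.0             # present CMB temperature T_0
T_INFL_K = 3e28           # temperature T_i when inflation occurs
T_I_S = 1e-35             # time t_i at which inflation occurs
HORIZON_TODAY_CM = 1e28   # c * t_0, the scale of the visible universe


def stretched_horizon_cm():
    """Present size of a horizon-sized patch 2*c*t_i if no inflation occurs."""
    growth = T_INFL_K / T_NOW_K  # a_0 / a_i, because T is proportional to 1/a
    return 2.0 * C_CM_PER_S * T_I_S * growth


def min_efolds():
    """Smallest N for which e^N times the stretched patch reaches c*t_0."""
    return math.log(HORIZON_TODAY_CM / stretched_horizon_cm())


def inflation_end_time_s(n_efolds):
    """End time for a ~ exp(H t), assuming H = 1/(2 t_i) (an illustrative choice)."""
    hubble = 1.0 / (2.0 * T_I_S)
    return T_I_S + n_efolds / hubble


if __name__ == "__main__":
    print(f"patch size today without inflation: {stretched_horizon_cm():.1e} cm")
    print(f"minimum number of e-folds:          {min_efolds():.1f}")
    print(f"end of 60 e-folds of inflation:     {inflation_end_time_s(60):.1e} s")
```

With these inputs the uninflated patch is about $6 \times 10^{3}$ cm across today, the minimum number of e-folds is close to 56, and 60 e-folds of exponential expansion end at roughly $10^{-33}$ s.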

The result of a sufficiently long period of accelerated expansion is to drive the local
expansion dynamics of the universe towards the isotropic de Sitter expansion, with
asymptotic form of the metric at large time, $t$, of the form ($\alpha, \beta = 1, 2, 3$) [18]

$$ds^2 = dt^2 - g_{\alpha\beta}\,dx^{\alpha}dx^{\beta},$$

$$g_{\alpha\beta} = \exp[2Ht]\,a_{\alpha\beta}(x) + b_{\alpha\beta}(x) + \exp[-Ht]\,c_{\alpha\beta}(x) + \cdots,$$

where $H$ is the constant Hubble rate, with $3H^2 = \Lambda$, and $a_{\alpha\beta}(x)$, $b_{\alpha\beta}(x)$ and $c_{\alpha\beta}(x)$ are
arbitrary symmetric spatial functions. The Einstein equations allow only two of the $a_{\alpha\beta}$ and
two of the $c_{\alpha\beta}$ to be freely specifiable and all the $b_{\alpha\beta}$ are determined by them. Thus there
are four independently arbitrary spatial functions specifying the solution on a spacelike
surface of constant $t$ in vacuum. Notice the spatial functions $a_{\alpha\beta}(x)$ at leading order in
the metric ($H$ is a constant though). This is why the metric only approaches de Sitter
locally, exponentially rapidly inside the event horizon of a geodesically moving observer.
If black body radiation ($p = \rho/3$) is added then a further four arbitrary spatial functions are
required (three for the normalised four-velocity components and one for the density) and

$$\rho \propto \exp[-4Ht],$$

$$u_0 \to 1, \qquad u_{\alpha} \propto \exp[Ht]\,c^{\beta}_{\alpha;\beta},$$

$$c^{\alpha}_{\alpha} = 0.$$

Hence, we see that the three-velocity $V^2 = u_{\alpha}u^{\alpha}$ tends to a constant as $t \to \infty$. The asymptotic
state is therefore de Sitter plus a constant (or ‘tilted’) velocity field which affects the
metric at third order (via $c_{\alpha\beta}(x)$). This is easy to understand physically if we consider a
large rotating eddy that expands with the universe and has angular velocity $\omega = Va^{-1}$.
Its angular momentum is $Ma^2\omega \propto (\rho a^3)a^2(Va^{-1}) = \rho a^4 V$ and this is conserved as the universe
expands. Since the radiation density falls as $\rho \propto a^{-4}$, we have $V$ constant as $a \to \infty$.

The number of free spatial functions specifying this asymptotic solution is eight in the
case with radiation (and the same holds when any other perfect fluid matter is present). In
the next section we will show that this is characteristic of a part of the general solution of
Einstein’s equations.

In conclusion we see that a finite period of accelerated expansion is able to drive the
expansion towards isotropy from a very large class of initial conditions (not all initial conditions,
since the universe must not, for example, recollapse before a period of accelerated
expansion begins). We have just discussed the most extreme form of accelerated expansion
with constant Hubble expansion rate, $H$, and $a \propto \exp[Ht]$ but similar conclusions hold for
power-law inflation, with $a \propto t^n$, $n > 1$, and intermediate inflation, with $a \propto \exp[At^n]$,
with $A > 0$ and $0 < n < 1$ constants. The key conceptual point is that inflation explains
the present isotropy without dissipating initial anisotropies in the way that the chaotic cosmology
programme imagined and so it evades the Barrow–Matzner entropy per baryon
constraint [41]. Instead, it drives the initial inhomogeneities far beyond the visible horizon
today and the stress driving the acceleration dominates over all forms of anisotropy at large
expansion volumes and times. The earlier analyses of the stability of isotropic expansion
by Collins and Hawking, and others [46] had restricted attention to forms of matter in the
universe with $\rho + 3p > 0$ and always assumed $\Lambda = 0$ because there was no reason to
think otherwise at that time. As a result, they had excluded the possibility of accelerated
expansion which can solve the isotropy problem without any dissipation occurring if it can
arise for a finite period of time in the early universe.


5.3.5 The Initial Value Problem 


The attempts to explain the isotropy of the universe from arbitrary initial conditions gave 
rise to another interesting perspective that is worth highlighting. General relativity is an 
initial value problem and so for ‘well-behaved’ cosmological solutions this means that 
the present state of the universe described by any solution of Einstein’s equations is a 
continuous function of some ‘initial data’ at any past time. In a technical sense it might 
appear that given any state of the universe today — highly anisotropic, for example — then 
there exists some initial data set that evolves to give that state regardless of the action of any 
damping effects. Hence, there could never be a theory that could explain the actual state of 
the universe today as the result of evolution from any (or almost any) initial conditions. The 
problem with this argument is that the initial conditions that do evolve to counter-factual 
cosmological states at late times may arise only from initial data states that are completely 
unphysical in some respect [45]. Take a simple example of a Bianchi type I anisotropic 
universe. The anisotropy energy density and radiation energy density fall as 


$$\sigma^2 = \sigma_0^2\,(1 + z)^6,$$

$$\rho_{\gamma} = \rho_{\gamma 0}\,(1 + z)^4.$$


We can choose values of the constant $\sigma_0^2$ so that the universe’s expansion is dominated
by anisotropy today: just pick $\sigma_0^2 = \rho_{\gamma 0} = 10^{-34}$ gm cm$^{-3}$ to specify the present data.
However, if we run this apparent counter-example back to the time when the radiation
temperature is $T_{pl} \sim 10^{32}$ K $= 3(1 + z_{pl})$ K, when its energy density equals the Planck density,
$10^{94}$ gm cm$^{-3}$, we require the anisotropy energy density to exceed the Planck energy density
at that time by a factor $(1 + z_{pl})^2$, of order $10^{63}$ to $10^{64}$: a completely unphysical situation.
Alternatively, if we had taken the anisotropy energy density to be the Planck density at $z_{pl}$ then we have the
strange initial condition that the radiation density is smaller by the same factor despite all forms of
energy being in quantum gravitational interaction at that time.
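
The size of the mismatch in this example follows directly from the two scaling laws. The following is a minimal sketch using the round numbers quoted above; the variable and function names are illustrative, and the anisotropy energy density is assumed to scale as $(1 + z)^6$ and the radiation density as $(1 + z)^4$, as for a Bianchi type I universe.

```python
# Present-day values chosen in the text (gm cm^-3); names are illustrative.
RHO_GAMMA_0 = 1e-34   # radiation energy density today
SIGMA2_0 = 1e-34      # anisotropy energy density today, set equal to the radiation
T_NOW_K = 3.0         # present radiation temperature
T_PLANCK_K = 1e32     # radiation temperature at the Planck epoch


def densities_at(one_plus_z):
    """Return (anisotropy, radiation) energy densities at redshift z."""
    sigma2 = SIGMA2_0 * one_plus_z ** 6        # anisotropy grows as (1+z)^6
    rho_gamma = RHO_GAMMA_0 * one_plus_z ** 4  # radiation grows as (1+z)^4
    return sigma2, rho_gamma


# T is proportional to (1 + z), so T_pl = T_0 * (1 + z_pl).
one_plus_z_pl = T_PLANCK_K / T_NOW_K
sigma2_pl, rho_gamma_pl = densities_at(one_plus_z_pl)
ratio = sigma2_pl / rho_gamma_pl               # equals (1 + z_pl)**2

if __name__ == "__main__":
    print(f"1 + z_pl                  = {one_plus_z_pl:.2e}")
    print(f"anisotropy / radiation    = {ratio:.2e}")
```

Because the two densities scale as $(1 + z)^6$ and $(1 + z)^4$, their ratio grows as $(1 + z)^2$ whatever normalisation is chosen today, so equal densities now imply a discrepancy of more than sixty orders of magnitude at the Planck epoch.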

This is a (deliberately) dramatic example but the basic problem with the argument is one 
that we can find with other arguments regarding the generality of more complicated out- 
comes in cosmology. For example, there have been claims (and claims to the contrary) that 
inflation is not generic for Friedmann universes containing scalar fields with a quadratic 
self-interaction potential [50-52]. The claim is based on using the Hamiltonian measure in 
the phase space for the dynamics to show that the bulk of the initial data measure is for 
solutions which do not inflate. This type of initial data corresponds to solutions with a huge
initial kinetic term ($\dot{\phi}^2$) which dominates the potential $V(\phi) = m^2\phi^2$ by a large factor so
that the potential never comes to dominate the dynamics by any pre-specified epoch. However,
this does not look very natural because it requires the two forms of energy density to
differ by an enormous factor when one of them equals the Planck energy density (above 
which we know nothing about what happens since general relativity, quantum mechanics 
and statistical mechanics all break down). The better course is not to let any energy density 
exceed the Planck value but (surprisingly) this appears to be controversial. 


5.4 Naive Function Counting 


There have been several attempts to reduce the description of the astronomical universe 
to the determination of a small number of measurable parameters. Typically, these will be 
the free parameters of a well defined cosmological model that uses the smallest number of 
constants that can provide a best fit to the available observational evidence. Specific examples
are the popular characterisations of cosmology as a search for ‘nine numbers’ [53],
‘six numbers’ [54], or the six-parameter minimal ΛCDM model used to fit the WMAP
[55] and Planck data sets [56]. In all these, and other, cases of simple parameter counting
there are usually many simplifying assumptions that amount to ignoring other parameters 
or setting them to zero; for example, by assuming a flat Friedmann background universe or 
a power-law variation of density inhomogeneity in order to reduce the parameter count and 
any associated degeneracies. The assumption of a power-law spectrum for inhomogeneities 
will reduce a spatial function to two constants, while the assumption that the universe is 
described by a Friedmann metric plus small inhomogeneous perturbations both reduces the 
number of metric unknowns and converts functions into constant parameters. In what fol- 
lows we are going to provide some context for the common minimal parameter counts cited 
above by determining the total number of spatial functions that are needed to prescribe the 
structure of the universe if it is assumed to contain a finite number of simple matter fields. 
We are not counting fundamental constants of physics, like the Newtonian gravitation constant,
the coupling constants defining quadratic Lagrangian extensions of general relativistic
gravity, or the 19 free parameters that define the behaviours of the 61 elementary particles
in the standard three-generation U(1) × SU(2) × SU(3) model of particle physics. However,
there is some ambiguity in the status of some quantities: for example, whether the
dark energy is equivalent to a true cosmological constant (a fundamental constant), or to
some effective fluid or scalar field, or some other emergent effect [57]. Some fundamental
physics parameters, like neutrino masses, particle lifetimes, or axion phases, can also play 
a part in determining cosmological densities but that is a secondary use of the cosmological 
observable. Here we will take an elementary approach that counts the number of arbitrary 
functions needed to specify the general solution of the Einstein equations (and its generali- 
sations). This will give a minimalist characterisation that can be augmented by adding any 
number of additional fields in a straightforward way. We will also consider the count in 
higher-order gravity theories as well as for general relativistic cosmologies. We enumerate
the situation in spatially homogeneous universes in detail so as to highlight the significant
impact of their spatial topology on evaluations of their relative generality. 

Let us move on to a more formal discussion of how to specify the generality of solu- 
tions to Einstein’s equations by counting the number of free functions (or constants) that a 
given solution (or approximate solution) contains. In view of the constraint equations and 
coordinate covariances of the theory this requires a careful accounting. 

The cosmological problem can be formulated in general relativity using a metric in a
general synchronous reference system [58]. Assume that there are $F$ matter fields which
are non-interacting and each behaves as a perfect fluid with some equation of state $p_i(\rho_i)$,
$i = 1, \ldots, F$. They will each have a normalised four-velocity field, $(u_a)_i$, $a = 0, 1, 2, 3$.
These will in general be different and non-comoving. Thus each matter field is defined on
a spacelike surface of constant time by four arbitrary functions of three spatial variables,
$x^{\alpha}$, since the $u_0$ components are determined by the normalisations $(u_a u^a)_i = 1$. This means
that the initial data for the $F$ non-interacting fluids are specified by $4F$ functions of three
spatial variables. If we were in an $N$-dimensional space then each fluid would require
$N + 1$ functions of $N$ spatial variables and $F$ fluids would require $(N + 1)F$ such functions
to describe them in general.

The 3D metric requires the specification of 6 $g_{\alpha\beta}$ and 6 $\dot{g}_{\alpha\beta}$ for the symmetric spatial
3 × 3 metric in the synchronous system but these may be reduced by using the four coordinate
covariances of the theory and a further four can be eliminated by using the four
constraint equations of general relativity. This leaves four independently arbitrary functions
of three spatial variables [58] which is just twice the number of degrees of freedom of
the gravitational spin-2 field. The general transformation between synchronous coordinate
systems maintains this number of functions [58]. This is the number required to specify
the general vacuum solution of the Einstein equations in a three-dimensional (3D) space.
In an $N$-dimensional space we would require $N(N + 1)$ functions of $N$ spatial variables
to specify the present data for $g_{\alpha\beta}$ and $\dot{g}_{\alpha\beta}$. This could be reduced by $N + 1$ coordinate
covariances and $N + 1$ constraints to leave $(N - 2)(N + 1)$ independent arbitrary functions
of $N$ variables [59]. This even number is equal to twice the number of degrees of freedom
of the gravitational spin-2 field in $N + 1$ dimensional spacetime.

When we combine these counts we see that the general solution in the synchronous
system for a general relativistic cosmological model containing $F$ fluids requires the specification
of $(N - 2)(N + 1) + F(N + 1) = (N + 1)(N + F - 2)$ independent functions of $N$
spatial variables. If there are also $S$ non-interacting scalar fields, $\phi_j$, $j = 1, \ldots, S$, present
with self-interaction potentials $V(\phi_j)$ then two further spatial functions are required ($\phi_j$ and
$\dot{\phi}_j$) to specify each scalar field and the total becomes $(N + 1)(N + F - 2) + 2S$. For the
observationally relevant case of $N = 3$, this reduces to $4(F + 1) + 2S$ spatial functions.

For example, if we assume a simple realistic scenario in which the universe contains
separate baryonic, cold dark matter, photon, neutrino and dark energy fluids, all with separate
non-comoving velocity fields, but no scalar fields, then $F = 5$ and our cosmology
needs 24 spatial functions in the general case. If the dark energy is not a fluid, but a cosmological
constant with constant density and $u_i = \delta_i^0$, then the dark energy ‘fluid’ description
reduces to the specification of a single constant, $\rho_{DE} = \Lambda/8\pi G$, rather than four functions
and reduces the total to 21 independent spatial functions. However, if the dark energy
is instead an evolving scalar field then we would have $F = 4$ and $S = 1$, and now 22
spatial functions are required. Examples of full-function asymptotic solutions were found
for perturbations around de Sitter space-time by Starobinsky [18], the approach to ‘sudden’
finite-time singularities [60, 61] by Barrow, Cotsakis and Tsokaros [62], and near
quasi-isotropic singularities with $p > \rho$ ‘fluids’ by Heinzle and Sandin [63].
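
The counting rules above are simple enough to tabulate mechanically. The following is a minimal sketch, with illustrative function names that are not taken from any reference; it adds the metric, fluid and scalar-field contributions separately and reproduces the totals quoted in the text.

```python
def metric_functions(n_space):
    """Free functions in the vacuum solution: (N - 2)(N + 1)."""
    return (n_space - 2) * (n_space + 1)


def fluid_functions(n_space, n_fluids):
    """Each perfect fluid needs N + 1 functions (density plus N velocity components)."""
    return (n_space + 1) * n_fluids


def total_functions(n_space=3, n_fluids=0, n_scalars=0):
    """Metric plus F fluids plus S scalar fields (two functions per scalar field)."""
    return (
        metric_functions(n_space)
        + fluid_functions(n_space, n_fluids)
        + 2 * n_scalars
    )


if __name__ == "__main__":
    print("vacuum, N = 3:              ", total_functions(3))
    print("five fluids:                ", total_functions(3, n_fluids=5))
    print("four fluids + one scalar:   ", total_functions(3, n_fluids=4, n_scalars=1))
    # A cosmological constant replaces one fluid's four functions by a single constant.
    print("four fluids + one constant: ", total_functions(3, n_fluids=4) + 1)
```

Keeping the three contributions as separate functions makes it straightforward to add further matter fields, or to change the number of space dimensions, without rederiving the closed-form total $(N + 1)(N + F - 2) + 2S$.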

These function counts of 21–24 should be regarded as lower bounds. They do not include
the possibility of a cosmological magnetic field or some other unknown matter fields. They
also treat all light ($\ll 1$ MeV) neutrinos as if they are identical (heavy neutrinos can
be regarded as CDM if they provide the largest contribution to the matter density but if
they are not responsible for the dominant dark matter then they should be counted as a
further contribution to $F$). If there are matter fields which are not simple fluids with $p(\rho)$
(for example, an imperfect fluid possessing a bulk viscosity or a gas of free particles with
anisotropic pressures) then additional parameters are required to specify them. There can
still be overall constraints (a trace-free energy-momentum tensor, for example, in the cases
of electric and magnetic fields or Yang–Mills fields) and we would just count the number
of independent terms in the total energy-momentum tensor [64].

In the case of the Planck or WMAP mission data analyses, six constants are chosen to
define the standard (minimal) ΛCDM model. For WMAP [55], these are the present-day
Hubble expansion rate, $H_0$, the densities of baryons and cold dark matter, the optical depth,
$\tau$, at a fixed redshift, and the amplitude and slope of an assumed power-law spectrum
of curvature inhomogeneities on a specified reference length scale. This is equivalent to
including three matter fields (radiation, baryons, cold dark matter) but the standard ΛCDM
assumes zero spatial curvature, $k$, ab initio, so a relaxation of this would add a curvature
term or a dark energy field, because when $k \neq 0$ the latter could no longer be deduced from
the other densities and the critical density (defined by $H_0$). The light neutrino densities
are assumed to be calculable from the radiation density using the standard cosmological
thermal history, so there are effectively $F = 5$ matter fields (with $k$ set to zero in the
base model and a metric time derivative determined by $H$). All deviations from isotropy
and homogeneity enter only at the level of perturbation theory and are characterised by
the spectral amplitude and slope on large scales; the amplitude on small scales (‘acoustic
peaks’ in the power spectrum) is determined from that on large scales by an $e^{-2\tau}$ damping
factor determined by the optical depth parameter $\tau$. The Planck mission’s parameter choice
is equivalent to this [56].

Although a general solution of the Einstein equations requires the full complement of 
arbitrary functions, different parts of the general solution space can have behaviours of 
quite different complexity. For example, when N ≤ 9 there are homogeneous vacuum
universes which are dynamically chaotic [39, 65, 66] but the chaotic behaviour disappears
when N ≥ 10 even though the number of arbitrary constants remains maximal for each
N [67]. Hence, the dynamical complexity can fail to be captured by the function-counting 
approach. 

5.4.1 Einstein’s ‘strength’


As an interesting historical aside, we should mention Einstein’s attempt to study the power 
of mathematical formalisms to describe physical theories by ascribing to them a numerical 
measure of their predictive power, which he called the ‘strength’ of a system of differential 
equations. It was to be measured by the number of free pieces of initial data needed to 
determine the general solutions of the equations. Einstein believed that ‘The smaller the
number of free data consistent with the system of field equations, the “stronger” is the
system. It is clear that in the absence of any other viewpoint from which to select the
equations, one will prefer a “stronger” system to a less strong one’ [68]. This was the method Einstein
proposed to follow in his quest for a unified field theory (how different to the methodology 
that led to all his past great successes). The enumeration of the strength of a system of 
equations for d variables began by expanding an analytic function of these variables in a 
Taylor series about a point and noting that at nth order the total number of terms in the
expansion is

$$\binom{n+d-1}{n} = \frac{(n+d-1)!}{n!\,(d-1)!}.$$

If there are field equations which ensure that, when the function is specified arbitrarily
on a (d − 1)-dimensional (spatial) surface, the coefficients in the remaining (temporal)
dimension are determined by them, then only $\binom{n+d-2}{n}$ of the Taylor series
coefficients remain arbitrary. The fraction of coefficients that remain free is therefore

$$\binom{n+d-2}{n} \bigg/ \binom{n+d-1}{n} = \frac{d-1}{n+d-1}.$$


In the case of Einstein’s equations we have coordinate covariances and constraint 
equations to use to reduce the count of free functions. The resulting strength turns out 
to be identical to the count of independent pieces of initial data for the metric and its first 
derivative that we have just described, giving a strength of four in vacuum. A similar count 
can be done for Maxwell’s equations (which have the same strength), or other equations 
of mathematical physics. A fuller discussion is given in Schutz (1975) [69] and Mariwalla 
(1974) [70]. 
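
The Taylor-coefficient counting behind the ‘strength’ can be checked numerically. The
following is a minimal sketch, assuming only the combinatorial count quoted above: a
function of d variables has $\binom{n+d-1}{n}$ coefficients at nth order, and data given
freely on a (d − 1)-dimensional surface correspond to the same formula with d replaced by
d − 1. The function names are illustrative, and the sketch does not include the further
reductions from coordinate covariance and constraints.

```python
from fractions import Fraction
from math import comb


def taylor_terms(n: int, d: int) -> int:
    """Number of n-th order Taylor coefficients of a function of d variables."""
    return comb(n + d - 1, n)


def free_fraction(n: int, d: int) -> Fraction:
    """Fraction of n-th order coefficients left free when the function is
    specified arbitrarily on a (d - 1)-dimensional surface."""
    return Fraction(taylor_terms(n, d - 1), taylor_terms(n, d))


d = 4  # three space dimensions plus time
for n in (1, 2, 5, 10):
    # The ratio of the two binomial coefficients reduces to (d - 1)/(n + d - 1).
    assert free_fraction(n, d) == Fraction(d - 1, n + d - 1)
    print(n, taylor_terms(n, d), taylor_terms(n, d - 1), free_fraction(n, d))
```

The free fraction falls with increasing order n, which is the sense in which the field
equations restrict the solution.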


5.5 More General Gravity Theories 


There has been considerable interest in trying to explain the dark energy as a feature of
a higher-order gravitational theory that extends the lagrangian of general relativity in a
non-linear fashion [71-74]. This offers the possibility of introducing a lagrangian that is
a function $L = f(R, R_{ab}R^{ab})$ of the scalar curvature R and/or the quadratic Ricci
invariant $R_{ab}R^{ab}$ in anisotropic models, with the property that it contributes a slowly
varying dark energy-like behaviour at late times without the need to specify an explicit
cosmological constant. However, these higher-order lagrangian theories (excluding the
Lovelock lagrangians, in which the variation of the higher-order terms contributes pure
divergences [75] and so the field equations are always second order in any spatial
dimension) all have fourth-order field equations in 3D space when f ≠ A + BR, with A, B
constants. This means that the initial data set for such theories is considerably enlarged
because we must specify $\ddot{g}_{\alpha\beta}$ and $\dddot{g}_{\alpha\beta}$ in addition to
$g_{\alpha\beta}$ and $\dot{g}_{\alpha\beta}$. In N space dimensions, this results in a further
N(N + 1) functions of N variables and so a general cosmological model with F fluids and S
scalar fields requires a specification of
$2(N^2 - 1) + F(N + 1) + 2S = (N + 1)(F + 2N - 2) + 2S$ independent arbitrary spatial
functions. For N = 3, this is 16 + 4F + 2S. General relativity with four matter fields plus
a cosmological constant requires 20 spatial functions plus one constant, in general, whereas
a higher-order gravity theory with four matter fields and no scalar fields (and no
cosmological constant because it should presumably emerge from the metric behaviour)
requires the specification of 32 spatial functions.
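
The function counts quoted above can be reproduced with a short sketch. The general
relativity count used here, (N + 1)(N − 2) metric functions in vacuum, is an assumption
inferred by subtracting the extra N(N + 1) functions from the higher-order total given in
the text; the function names are illustrative.

```python
def gr_functions(n_space: int, fluids: int, scalars: int = 0) -> int:
    """Free spatial functions in general relativity: (N + 1)(N - 2) for the
    vacuum metric, N + 1 per fluid and 2 per scalar field."""
    return (n_space + 1) * (n_space - 2) + fluids * (n_space + 1) + 2 * scalars


def higher_order_functions(n_space: int, fluids: int, scalars: int = 0) -> int:
    """Free spatial functions in a fourth-order theory: the general relativity
    count plus a further N(N + 1) from the two extra time derivatives."""
    return gr_functions(n_space, fluids, scalars) + n_space * (n_space + 1)


# Check the closed form (N + 1)(F + 2N - 2) + 2S over a small range.
for n in range(2, 8):
    for f in range(0, 6):
        for s in range(0, 3):
            assert higher_order_functions(n, f, s) == (n + 1) * (f + 2 * n - 2) + 2 * s

print(gr_functions(3, fluids=0))            # vacuum general relativity in 3D
print(gr_functions(3, fluids=4))            # four matter fields
print(higher_order_functions(3, fluids=4))  # four matter fields, fourth-order theory
```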


5.6 Reducing Functions to Constants 


The commonest simplification used to reduce the size of the cosmological characterisation
problem is to turn the spatial functions into constants. This simplification will be exact if
the universe is assumed to be spatially homogeneous. The set of possible spatially
homogeneous and anisotropic universes with natural topology is based upon the classification
of homogeneous three-spaces created by Bianchi [76-79] (together with the exceptional case
of Kantowski-Sachs-Kompanyeets-Chernov with $S^1 \times S^2$ topology [80, 81], which we
will ignore here as it displays non-generic behaviour).

The most general Bianchi-type universes are those of types $VI_h$, $VII_h$, VIII and IX.
Of these, only types $VII_h$ and IX, respectively, contain open and closed isotropic
Friedmann subcases. These most general Bianchi types are all defined by four arbitrary
constants in vacuum plus a further four for each non-interacting perfect fluid source.
Therefore, in 3D spaces, the most general spatially homogeneous universes containing
F fluids are defined by 4(1 + F) arbitrary constants. This suggests that they might
be the leading order term in a linearisation of the general inhomogeneous solution in
the homogeneous limit. However, things might not be so simple. The four-function
space of solutions to Einstein’s equations for models like type IX with compact spaces
has a conical structure at points with Killing vectors and so linearisation about these
points must control an infinite number of spurious linearisations (associated with all the
tangents that can be drawn through the point of the cone but do not run down the side of
the cone) that are not the leading-order terms in any convergent series expansion of a true
solution [21, 82].

The Bianchi classification of spatially homogeneous universes derives from the classi- 
fication of the group of isometries with 3D subgroups that act simply transitively on the 
manifold. Intuitively, these give cosmological histories that look the same to observers in
different places on the same hypersurface of constant time. 

The Bianchi types are subdivided into two classes [83]: Class A contains types I(1 + F),
II(2 + 3F), $VI_0$(3 + 4F), $VII_0$(3 + 4F), VIII(4 + 4F) and IX(4 + 4F), while Class
B contains types V(1 + 4F), IV(3 + 4F), III(3 + 4F), $VI_{-1/9}$(4 + 3F), $VI_h$(4 + 4F) and
$VII_h$(4 + 4F). The brackets following each Roman numeral of the Bianchi-type geometry
contain the number of constants defining the general solution when F non-interacting
perfect fluids, each with p > −ρ, are present, so F = 0 defines the vacuum case. For example,
Bianchi type I, denoted by I(1 + F), is defined by one constant in vacuum (when it is the
Kasner metric) and one additional constant for the value of the density when each matter
field is added. For simplicity, we have ignored scalar fields here, but to include them simply
add 2S inside each pair of brackets. The Euclidean metric geometry in the type I case requires
$R_{0\alpha} = 0$, identically, and so the three non-comoving velocities (and hence any possible
vorticity) must be identically zero. This contains the zero-curvature Friedmann model as
the isotropic (zero parameter) special case. In the next simplest case, of type V, the general
vacuum solution was found by Saunders [84] and contains one parameter, but each additional
perfect fluid adds four parameters because it requires specification of a density and
three non-zero $u_\alpha$ components. The spatial geometry is a Lobachevsky space of constant
negative isotropic curvature. The isotropic subcases of type V are the zero-parameter Milne
universe in vacuum and the F-parameter open Friedmann universe containing F fluids.
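
The bracket notation above amounts to a small lookup table. The sketch below, with
hypothetical names, transcribes the Class A and Class B counts as pairs (constants in
vacuum, extra constants per fluid) and adds 2S for S scalar fields, as described in the
text.

```python
# (vacuum constants, constants added per perfect fluid), transcribed from the
# Class A and Class B lists in the text. The labels are plain-text stand-ins
# for the subscripted Bianchi type names.
GENERAL_SOLUTION = {
    # Class A
    "I": (1, 1), "II": (2, 3), "VI_0": (3, 4), "VII_0": (3, 4),
    "VIII": (4, 4), "IX": (4, 4),
    # Class B
    "V": (1, 4), "IV": (3, 4), "III": (3, 4),
    "VI_-1/9": (4, 3), "VI_h": (4, 4), "VII_h": (4, 4),
}


def n_constants(bianchi_type: str, fluids: int = 0, scalars: int = 0) -> int:
    """Constants defining the general solution with F fluids and S scalar fields."""
    vacuum, per_fluid = GENERAL_SOLUTION[bianchi_type]
    return vacuum + per_fluid * fluids + 2 * scalars


print(n_constants("I"))                        # Kasner vacuum
print(n_constants("V", fluids=1))              # type V with one fluid
print(n_constants("IX", fluids=2, scalars=1))  # type IX, two fluids, one scalar field
```

The maximum over all types in vacuum is four, attained by $VI_h$, $VII_h$, VIII and IX
(and by the special case $VI_{-1/9}$), in line with the 4(1 + F) count for the most general
types.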

In practice, one cannot find exact homogeneous general solutions containing the maximal
number of arbitrary constants because they are too complicated mathematically,
although the qualitative behaviours are fairly well understood, and many explorations of
the observational effects use the simplest Bianchi I or V models (usually without including
non-comoving velocities) because they possess isotropic three-curvature and add only a
simple fast-decaying anisotropy term (requiring one new constant parameter) to the
Friedmann equation. The most general anisotropic metrics which contain isotropic special
cases, of types $VII_h$ and IX, possess both expansion anisotropy (shear) and anisotropic
three-curvature. Their shear falls off more slowly (logarithmically in time during the
radiation era) and the observational bounds on it are much weaker [35, 46, 85-88].


5.7 Some Effects of the Topology of the Universe 


So far, we have assumed that the cosmological models in question have the ‘natural’
topology, that is $R^3$ for the 3D flat and negatively curved spaces and $S^3$ for the closed spaces.
However, compact topologies can also be imposed upon flat and open universes to make 
their spatial volumes finite and there has been considerable interest in this possibility and 
its observational consequences for optical images of galaxies and the CMB [89-91]. 

The classification of compact negatively curved spaces is a challenging mathematical 
problem. When compact spatial topologies are imposed on spatially flat and open homo- 
geneous cosmologies, this produces a major change in their relative generalities and the 
numbers of constants needed to specify them in general. 

The most notable consequence of a compact topology on 3D homogeneous spaces is
that the Bianchi universes of types IV and $VI_h$ no longer exist at all and open universes
of Bianchi types V and $VII_h$ must be isotropic with spaces that are quotients of a space of
constant negative curvature, as required by Mostow’s Rigidity theorem [92-96]. The only
universes with non-trivial structure that differs from that of their universal covering spaces
are those of Bianchi types I, II, III, $VI_0$, $VII_0$ and VIII. The numbers of parameters needed
to determine their general cosmological solutions when F non-interacting fluids are present
and the spatial geometry is compact are now given by I(10 + F), II(6 + 3F),
III(2 + $N'_m$ + F), $VI_0$(4 + 4F), $VII_0$(8 + 4F) and VIII(4 + $N_m$ + 4F), again with F = 0
giving the vacuum case, as before, and an addition of 2S to each prescription if S scalar
fields are included. Here, $N_m$ is the number of moduli degrees of freedom, which measures
the complexity of the allowed topology, with $N_m = 6g + 2k - 6 = N'_m - 2g$, where g is the
genus and k is the number of conical singularities of the underlying orbifold [94, 95]. It
can be arbitrarily large.

The rigidity restriction that compact types V and $VII_h$ must be isotropic means that
compactness creates general parameter dependencies of V(F) and $VII_h$(F) which are the
same as those for the open isotropic Friedmann universe, or the Milne universe in vacuum
when F = 0.

The resulting classification is shown in Table 5.1 [97]. We see that the introduction of
compact topology for the simplest Bianchi type I spaces produces a dramatic increase in
relative generality. Indeed, they become the most general vacuum models by the
parameter-counting criterion. An additional nine parameters are required to describe the
compact type I universe compared to the case with non-compact Euclidean $R^3$ topology. The
reason for
this increase is that at any time the compact three-torus topology requires three identifi- 
cation scales in orthogonal directions to define the torus and three angles to specify the 
directions of the vectors generating this lattice plus all their time-derivatives. This gives 
12 parameters, of which two can be removed using a time translation and the single non- 
trivial Einstein constraint equation, leaving ten in vacuum compared to the one required in 
the non-compact Kasner vacuum case. 

The following general points are worth noting: 


1. The imposition of a compact topology changes the relative generalities of homogeneous 
cosmologies. 

2. The compact flat universes are more general in the parameter-counting sense than the 
open or closed ones. 

3. Type VIII universes, which do not contain Friedmann special cases but can in principle
become arbitrarily close to isotropy, are the most general compact universes.
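
The compact-topology counts and the three points above can be checked with a small
sketch. The function names and the default genus are illustrative assumptions; the formulae
are the ones quoted in this section, with $N_m = 6g + 2k - 6$ and $N'_m = N_m + 2g$.

```python
def moduli(genus: int, conical_points: int):
    """Return (N_m, N'_m) with N_m = 6g + 2k - 6 and N'_m = N_m + 2g."""
    n_m = 6 * genus + 2 * conical_points - 6
    return n_m, n_m + 2 * genus


def compact_constants(bianchi_type: str, fluids: int = 0, scalars: int = 0,
                      genus: int = 2, conical_points: int = 0) -> int:
    """Constants defining the general compact solution (types IV and VI_h are
    absent because no compact universes of those types exist)."""
    n_m, n_m_prime = moduli(genus, conical_points)
    counts = {
        "I": 10 + fluids,
        "II": 6 + 3 * fluids,
        "III": 2 + n_m_prime + fluids,
        "VI_0": 4 + 4 * fluids,
        "VII_0": 8 + 4 * fluids,
        "VIII": 4 + n_m + 4 * fluids,
        "IX": 4 + 4 * fluids,
        "V": fluids,        # forced to be isotropic by rigidity
        "VII_h": fluids,    # forced to be isotropic by rigidity
    }
    return counts[bianchi_type] + 2 * scalars


# Three-torus count: 3 identification scales and 3 angles, plus their time
# derivatives, minus a time translation and one constraint equation.
torus_vacuum = 2 * (3 + 3) - 2
assert torus_vacuum == compact_constants("I")

# Flat compact type VII_0 exceeds closed type IX for any number of fluids.
assert all(compact_constants("VII_0", f) > compact_constants("IX", f) for f in range(10))

# Type VIII grows without bound as the genus increases.
print([compact_constants("VIII", genus=g) for g in (2, 3, 5, 10)])
```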


The most general case that contains an isotropic special case is that of type $VII_0$. Recall
that the $VII_h$ metrics are forced to be isotropic, so open Friedmann universes now become
asymptotically stable [94] and approach the Milne metric, whereas in the non-compact
case they are merely stable and approach a family of anisotropic vacuum plane waves
[47]. This peculiar hierarchy of generality should be seen as a reflection of how difficult
it is to create compact homogeneous spaces supporting these homogeneous groups of
motions.

Table 5.1. The number of independent arbitrary constants required to
prescribe the general 3D spatially homogeneous Bianchi-type universes
containing F perfect fluid matter sources in cases with non-compact and
compact spatial topologies. The vacuum cases arise when F = 0. If S
scalar fields are also present then each parameter count increases by 2S.
The type IX universe does not admit a non-compact geometry and
compact universes of Bianchi types IV and $VI_h$ do not exist. Types III and
VIII have potentially unlimited topological complexity and arbitrarily
large numbers of defining parameters through the unbounded
topological parameters $N_m = 6g + 2k - 6$ and $N'_m = N_m + 2g$, where g is
the genus and k is the number of conical singularities of the underlying
orbifold [97].

                  No. of defining parameters with F non-interacting fluids
Cosmological
Bianchi type      Non-compact topology      Compact topology
I                 1 + F                     10 + F
II                2 + 3F                    6 + 3F
VI_0              3 + 4F                    4 + 4F
VII_0             3 + 4F                    8 + 4F
VIII              4 + 4F                    4 + N_m + 4F
IX                -                         4 + 4F
III               3 + 4F                    2 + N'_m + F
IV                3 + 4F                    -
V                 1 + 4F                    F
VI_h              4 + 4F                    -
VII_h             4 + 4F                    F


5.8 Inhomogeneity 


The addition of inhomogeneity turns the constants defining the cosmological problem into 
functions of three space variables. For example, we are familiar with the linearised solu- 
tions for small density perturbations of a Friedmann universe with natural topology which 
produces two functions of space that control temporally growing and decaying modes. 
The function of space in front of the growing mode is typically written as a power-law 
in length scale (or wave number) and so has arbitrary amplitude and power index (both 
usually assumed to be scale-independent constants to first or second order) which can be 
fitted to observations. Clearly there is no limit to the number of parameters that could
be introduced to characterise the density inhomogeneity function by means of a series 
expansion around the homogeneous model (and the same could be done for any vortical 
or gravitational-wave perturbation modes) but the field equations would leave only eight 
independent functions. Further analysis of the function characterising the radiation den- 
sity is seen in the attempts to measure and calculate the deviation of its statistics from 
Gaussianity [98] and to reconstruct the past light-cone structure of the universe [99]. Any 
different choice of specific spatial functions to characterise inhomogeneity in densities or 
gravitational waves requires some theoretical motivation. What happens in the inhomoge- 
neous case if open or flat universes are given compact spatial topologies is not known. As 
we have just seen, the effects of topology on the spatially homogeneous anisotropic models
were considerable whereas the effects on the overall evolution of isotropic models (as
opposed to the effects on image optics) are insignificant. It is generally just assumed that
realistically inhomogeneous universes with non-positive curvature (or curvature of varying 
sign) can be endowed with a compact topology and, if so, this places no constraints on 
their dynamics. However, both assumptions would be untrue for homogeneous universes 
and would necessarily fail for inhomogeneous ones in the homogeneous limit. It remains to 
be determined what topological constraints arise in the inhomogeneous cases. They could 
be weaker because inhomogeneous anisotropies can be local (far smaller in scale than 
the topological identifications) or they could be globally constrained like homogeneous 
anisotropies. Newtonian intuitions can be dangerous because compactification of a New- 
tonian Euclidean cosmological space seems simple but if we integrate Poisson’s equation 
over the compact spatial volume we see that the total mass of matter must be zero. This 
follows from Poisson’s equation since 


$$0 = \int_V \nabla^2 \Phi \, dV = 4\pi G \int_V \rho \, dV = 4\pi G M,$$

where V is the compact spatial volume, M the total mass, and Φ is the Newtonian
gravitational potential.
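
The vanishing of the integrated Laplacian on a compact space has a simple discrete
analogue. The following minimal sketch works on a one-dimensional periodic grid for
brevity (an assumption made for illustration; the argument in the text is for a compact 3D
volume) and uses integer values so that the sum is exact.

```python
import random


def periodic_laplacian(phi):
    """Second-difference Laplacian on a 1D periodic grid with unit spacing."""
    n = len(phi)
    return [phi[(i + 1) % n] - 2 * phi[i] + phi[i - 1] for i in range(n)]


random.seed(0)
# An arbitrary integer-valued potential on a closed (periodic) grid.
phi = [random.randint(-100, 100) for _ in range(64)]
laplacian = periodic_laplacian(phi)

# Each phi[i] enters the sum with weights +1, -2 and +1, so the total vanishes
# identically: the discrete counterpart of the volume integral of the Laplacian
# over a space without boundary.
total = sum(laplacian)
print(total)  # 0
```

Since Poisson’s equation sets the Laplacian of Φ proportional to the density, a vanishing
total forces the summed density, and hence the total mass, to be zero.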

In practice, there is a divide between the complexity of inhomogeneity in the universe 
on small and large scales. On large scales there has been effectively no processing of the 
primordial spectrum of inhomogeneity by damping or non-linear evolution. Its description 
is well approximated by replacing a smooth function by a power-law defined by two con- 
stants, as for the microwave background temperature fluctuation spectrum or the two-point 
correlation function of galaxy clustering. Here, the defining functions may be replaced by 
statistical distributions for specific features, like peaks or voids in the density distribution. 
On small scales, inhomogeneities that entered the horizon during the radiation era can be 
damped out by photon viscosity or diffusion and may leave distortions in the background 
radiation spectrum as witness to their earlier existence. The baryon distribution may pro- 
vide baryon acoustic oscillations which yield potentially sensitive information about the 
baryon density [55, 56]. On smaller scales that enter the horizon later, where damping and 
non-linear self-interaction have occurred, the resulting distributions of luminous and dark 
matter are more complicated. However, they are correspondingly more difficult to predict
in detail and numerical simulations of ensembles of models are used to make predictions 
down to the limit of reliable resolution. Predicting their forms also requires a significant 
extension of the simple, purely cosmological enumeration of free functions that we have 
discussed so far. Detailed physical interactions, 3D hydrodynamics, turbulence, shocks, 
protogalaxy shapes, magnetic fields, and collision orientations, all introduce additional 
factors that may increase the parameters on which observable outcomes depend. The so- 
called bias parameter, equal to the ratio of luminous matter density to the total density, is 
in reality a spatial function that is being used to follow the ratio of two densities because 
one (the dark matter) is expected to be far more smoothly distributed than the other. All 
these small scale factors combine to determine the output distribution of the baryonic and 
non-baryonic density distributions and their associated velocities. 


5.8.1 Links to Observables 


The free spatial functions (or constants) specifying inhomogeneous (homogeneous) met- 
rics have simple physical interpretations. In the most general cases the four vacuum 
parameters can be thought of as giving two shear modes (i.e. time-derivatives of metric 
anisotropies) and two parts of the anisotropic spatial curvature (composed of ratios and 
products of metric functions). In the simplest vacuum models of types I and V the
three-curvature is isotropic and there is only one shear parameter. It describes the allowed
metric shear and in the type V model a second parameter is the isotropic three-curvature
(which is zero in type I), just like k in the Friedmann universe models. When matter is added
there is always a single ρ (or p) for each perfect fluid and up to three non-comoving fluid
velocity components. If the fluid is comoving, as in type I, only the density parameter is
required for each fluid; in type V there can also be three non-comoving velocities. The
additional parameters control the expansion shear anisotropy and the anisotropic
three-curvature. They may all contribute to temperature anisotropy in the CMB radiation but
the observed anisotropy is determined by an integral down the past null cone over the shear
(effectively the shear to Hubble rate ratio at last scattering of the CMB), rather than the
Weyl curvature modes
driven by the curvature anisotropy (which can be oscillatory [100], and so can be periodi- 
cally very small even though the envelope is large), while the velocities contribute dipole 
variations. Thus, it is difficult to extract complete information about all the anisotropies 
from observations of the lower multipoles of the CMB alone in the most general cases 
[99, 101, 102]. 

At present, the observational focus is upon testing the simplest possible ΛCDM model,
defined by the smallest number (six) of parameters. As observational sensitivity increases it 
will become possible to place specific bounds or make determinations of the full spectrum 
of defining functions (or constants), or at least to confirm that they remain undetectably 
small as inflation would lead us to expect. In an inflationary model they can be iden- 
tified with the spatial functions defining the asymptotic expansion around the de Sitter 
metric [18]. 

There have also been interesting studies of the observational information needed to
determine the structure of our past null cone rather than constant-time hypersurfaces in 
the Universe [103], extending earlier investigations of the links between observables and 
general metric expansions by McCrea [104] and by Kristian and Sachs [105]. 

The high level of isotropy in the visible universe, possibly present as a consequence of 
a period of inflation in the early universe [12], or special initial conditions [45, 63, 107- 
110], is what allows several of the defining functions of a generic cosmological model 
to be ignored on the grounds that they are too small to be detected with current technol- 
ogy. An inflationary theory of the chaotic or eternal variety, in which inflation only ends 
locally, will lead to some complicated set of defining functions that exhibit large smooth 
isotropic regions within a complicated global structure which is beyond our visual hori- 
zon and unobservable (although not necessarily falsifiable within a particular cosmological 
model). However, despite the success of simple cosmological theories in explaining almost 
all that we see in the universe, it is clear that there is an under-determination problem: we 
cannot make enough observations to specify the structure of space-time and its contents, 
even on our past light cone, let alone beyond it. It is not a satisfactory methodology to 
use observations to construct a description of space-time. Rather, we proceed by creating 
parametrised descriptions that follow from solutions of Einstein’s equations (or some other 
theory) and then constrain the free parameters by using the observational data. Despite the
widespread lip service paid to Popper’s doctrine of falsification as a scientific methodol- 
ogy, its weaknesses are especially clear in cosmology. It assumes that all observations and 
experimental results are correct and unbiased — that what you see is what you get. In prac- 
tice, they are not and you never know whether observational data are falsifying a theory, or 
are based on wrong measurements, or subject to some unsuspected selection effect [111]. 
All that observational science can ever do is change the likelihood of a particular theory 
being true or false. Sometimes the likelihood can build up (or down) to such an extent that 
we regard a theory being tested (like the expansion of the universe) as ‘true’ or (like cold 
fusion) as ‘false’. 

6
Emergent Structures of Effective Field Theories 


JEAN-PHILIPPE UZAN 


6.1 Introduction 


Science and philosophy have a strong interaction and an unquestionable complementarity
but one needs to draw a clear line between what each tells us about the world, and between
their respective scopes and implications.


6.1.1 Role of Cosmology 


Cosmology does indeed play an important role in this debate. It has long been part of 
mythology and philosophy and, during the twentieth century, adopted a scientific method, 
thanks to the observational possibilities offered by astrophysics. We can safely state that 
scientific cosmology was born with Einstein’s general relativity, a theory of gravitation that 
made space and time dynamical, a physical field that needs to be determined by solving 
dynamical equations known as the Einstein field equations. The concept of spacetime was born
with this theory, and in its early years cosmology was a space of freedom in which to think
about general relativity (Eisenstaedt, 1989). It led to the formulation of a ‘standard’ model, often referred
to as the big bang model, that allows us to offer a history of the matter in the universe, in 
particular by providing an explanation for the synthesis of each of the elements of the 
Mendeleev table, and a history of the formation of the large scale structure. Cosmology 
has also excluded many possible models and hypotheses, such as the steady state model or 
topological defects as the seeds of the large scale structure. 

The current standard cosmological model can be summarised by Figure 6.1 from which 
one can easily conclude that all the phenomena that can be observed in our local universe, 
from primordial big bang nucleosynthesis (BBN) to today, rely mostly on general relativity, 
electromagnetism and nuclear physics, that is on physics below 100 MeV and well under 
control experimentally (see Ellis et al. (2012); Mukhanov (2005); Peter and Uzan (2009) 
for textbooks on modern cosmology). This is an important property since the fact that we 
can understand our observable universe does not rely on our ability to construct e.g. a 
theory of quantum gravity. This is the positive aspect, the negative one being that it may be 
very difficult to find systems in which gravity, quantum physics and observations appear 
together, inflationary perturbations and black holes being probably the only ones. At later
times, the dynamics of the large scale structure becomes non-linear and requires the use
of numerical simulations to be fully understood. At early times, the physics becomes more
speculative. Extensions of the standard model of fields and interactions are then needed to
understand baryogenesis, to construct the physics of the inflationary phase, etc.

[Figure 6.1: timeline of cosmic history, running from a speculative earliest stage
(multiverse, quantum gravity, extra dimensions) through inflation, the primordial universe
(reheating, baryogenesis), nucleosynthesis, the hot universe, the CMB, the first stars, and
the post-recombination universe with large scale structure and supernovae.]

Figure 6.1 Summary of the history of our local universe in the standard cosmological model. It
indicates the cosmic time elapsed after the big bang, as derived from the Friedmann equations, and
the temperature of the photon bath filling the universe. The local universe provides observations on
phenomena from big bang nucleosynthesis to today, spanning a range between $10^{-3}$ s and 13.7 Gyr.
One major transition is the equality, which separates the universe into two eras: a matter-dominated
era during which large scale structure can grow and a radiation-dominated era during which the
radiation pressure plays a central role, in particular in allowing for acoustic waves. Equality is
followed by recombination, which can be observed through the CMB anisotropies. For temperatures
larger than $10^{11}$ K, the microphysics is less understood and more speculative. Many phenomena
such as baryogenesis and reheating still need to be understood. Earlier, the microphysics relies on
assumptions that may seem out of reach today. The whole history of our universe appears as a
parenthesis of decelerated expansion, during which complex structures can form, in between two
periods of accelerated expansion, which do not allow for this complex structure to either appear or
even survive. From Uzan (2013).

Both the existence of a structured and complex universe and the fact that we can understand
it call for an explanation: our universe from 1 s after the big bang to today, 13.7 Gyr
after the big bang, can be described as a parenthesis between two inflationary phases, during
which the laws of physics that have been determined locally, at low energy and in the linear
regime, are sufficient, making them easy to solve.


6.1.2 Cosmological Models 


A cosmological model is a mathematical representation of our universe that is based on 
the laws of nature that have been derived and validated locally in our Solar system. It 
thus stands at the cross-road between theoretical physics and astronomy. Its construction 
depends on our knowledge of microphysics but also on a priori hypotheses about the geometry
of the spacetime describing our universe. It follows that such a construction relies on
four main hypotheses (see Uzan (2010) for a detailed description): (H1) a theory of gravity,
(H2) a description of the matter contained in the universe and its non-gravitational
interactions, (H3) a symmetry hypothesis, and (H4) a hypothesis on the global structure, i.e.
the topology, of the universe. These hypotheses are not on the same footing since (H1) 
and (H2) refer to the physical theories. These two hypotheses are however not sufficient 
to solve the field equations and we must make an assumption on the symmetries (H3) of 
the solutions describing our universe on large scales while (H4) is an assumption on some 
global properties of these cosmological solutions, with the same local geometry. 


6.1.3 Cosmic Extrapolations 


In such an approach, we need to push the theories and models that have been validated 
in our local neighbourhood beyond their domain of validity, in energy, space and/or time. 
In doing so, we usually adopt a ‘radical conservatist’ attitude, following the words by 
Wheeler (Deutsch, 1997), in which we continue to adopt our models and theories until 
we reach either an inconsistency with experiment or an internal theoretical inconsistency. 
The former clearly provides an experimental insight that our extrapolation is not correct
while the latter calls for new concepts to be forged in order to resolve inconsistencies that
may appear due to the ‘collision’ of two theories (I detailed the way fundamental constants
played a major role in this evolution in Uzan and Lehoucq (2005), which is unfortunately
not translated into English). Indeed in such an approach, we tend to attribute a high credence
to principles and mathematical structures that have proved to be very efficient to formulate 
the laws of nature locally. This high credence is not a proof of validity of the extrapolation 
but only reflects the fact that no catastrophe arises from using these theories far outside 
their domain of validity, and there is no need to invoke new structures or theories, hence 
the conservatism. 

As an example, any of the four hypotheses can be extrapolated independently, giving
different images of the nature of the universe, all compatible with our knowledge of the
local observable universe. For instance, it can be shown that a universe with a non-trivial
topological structure (hypothesis H4), and hence a finite spatial volume,
cannot be distinguished from an infinite space as soon as its typical size is larger than 1.15
times the radius of the last scattering surface (Fabre et al., 2013). Above such a size, the
extrapolation to an infinite universe is no longer under control, since it cannot be distinguished
observationally from a finite universe. Another example concerns the implications of the
existence of a cosmological constant driving the late time acceleration of the expansion of
our universe (see Figure 6.2). According to our credence in the validity of the Copernican
principle (H3) and the scale on which it applies, one can argue for the existence of a
multiverse, large-scale inhomogeneities or new physics (see e.g. Carr (2007) for a discussion
of the different possibilities). The Copernican principle can be tested on the size of
the observable universe (Caldwell and Stebbins, 2008; Goodman, 1995; Uzan et al., 2008),
making model II.a of Figure 6.2 very constrained, hence decreasing significantly the
credence of extrapolation II.b. But it does not increase the credence of model I.b, even though I.a
gives a more accurate description of our local universe. Similarly, modifications of general
relativity on large scales can be tested (Uzan and Bernardeau, 2001).

Figure 6.2 On the scales of the observable universe (circle), the acceleration of the universe can
be explained by a cosmological constant, in which case the construction of the cosmological model
relies on the Copernican principle (upper left). To make sense of a cosmological constant, one may
argue for introducing a large structure known as the multiverse (upper right), which can be seen as a
collection of universes of all sizes and in which the values of the cosmological constant, as well
as other constants, are randomised. The anthropic principle then states that we observe only those
universes where the values of these constants are such that observers can exist. In this sense we have
to abandon the Copernican principle on the scales of the multiverse. An alternative is to assume
that there is no need for a cosmological constant or new physics, in which case we have to abandon
the Copernican principle and assume e.g. that we are living in an under-dense region (lower left).
However, we may recover the Copernican principle on larger scales if there exists a distribution of
over- and under-dense regions of all sizes and densities on super-Hubble scales, without the need for
a multiverse. In such a view, the Copernican principle will be violated on Hubble scales, just because
we live in such a structure, which happens to have a size comparable to that of the observable
universe. This latter view is now not favoured by observations. From Uzan (2010).

In this process, we use different kinds of proofs. First, the historical proof consists of
starting from our local observations and organising them consistently in order to draw
conclusions on the conditions in the past. In this abductive inference, we can determine the
scenario that maximises the consistency of the facts and its explanatory power (in a class of
scenarios that needs to be mathematically well defined). It follows that our conclusions are
only probable. Second, the experimental or observational proofs aim at understanding the
local laws and causes. They allow one to exclude hypotheses and can modify the inference
of the historical abduction (see e.g. the discussion on constants below). Hence cosmology
is a constant interplay between local and global features. It is important to explain to what
extent our conclusions are robust.

In cosmology, when confronted with an inconsistency, one always has to consider three
options. One can either invoke the need for new physics or a modification of the laws of
physics we have extrapolated (e.g. on large cosmological distances, in the low curvature regime,
etc.), or have a more conservative attitude concerning fundamental physics and modify the
cosmological hypotheses. The third option, and indeed the first to consider, is to be very
conservative and doubt the observations (astrophysical solution).

At this stage, we need to recall that it is central to distinguish between the conclusions that
have been validated by experiments or observations and the extrapolations that merely carry
a high credence.

Theoretical extrapolation is a valuable enterprise in order to understand the cartography
of the space of possible worlds that a set of theories allows one to construct (this is indeed
our favourite game). Usually, we are aware of the limitations of these speculations, because
we know they are built on many speculative layers. As an internal debate, it causes
no problem. But we need to be careful as to what is delivered to a larger audience:
colleagues from other fields and the general public. In particular, drawn by extrapolations and
theoretical stories on the different ways the universe may or may not be, public debates tend
to slide towards some relativism (sometimes even leading some extreme social constructivists to
describe the scientific method as a myth).

To that purpose, I think we need to reaffirm that science is first characterised by a
methodological contract. It represents the minimum we have to follow in order to consider
our results as scientific. It can indeed be improved and is tacit between all of us. Our goal is
to deliver to the public objective knowledge about the world. This implies that we need to
adopt a materialistic method. For instance, we assume the existence of some reality,
independent of our own existence and of our studying it; we approach it in a neutral way with
the goal of revealing the way it appears to us, in its phenomena that are reproducible and
testable. This materialism is a methodology and indeed not a philosophy nor the product of
any philosophy. It is simply a methodology that protects us from relativism or spirituality
that can alter or instrumentalise its productions.

Indeed, science and philosophy have a strong relation. Scientific results set a passive 
constraint on any philosophy and philosophy helps us in coining our concepts. Needless 
to say there is porosity and that the boundary has been fluctuating over centuries. To the 
outside world, we need to make sure that science is first judged by its procedures rather than 
by its results, that our attitude does not consist in finding an explanation for ‘everything’ 
now but in classifying the problems and answering some well-defined questions, hence the 
notion of a good model associated with a domain of validity. If we are not able to formulate
clear rules that dictate the construction of knowledge, then we will have to stick to facts and
results rather than to methods and we may slowly slide into a debate between two beliefs.
This knowledge belongs to the whole of society (which is actually the reason I believe we
have to be public servants!), and so is freely available.

In our debate, in a closed room (even though there was a camera, sic!) we can indeed 
draw personal metaphysical reflections based on our scientific constructions and what we 
have understood from the world. Interpretation of physical theories is a matter of debate 
and we shall not avoid it, the typical example being quantum mechanics. Interestingly, 
some theories give their own interpretation, as is the case of general relativity, while 
others still allow a huge number of possible interpretations leading to radically differ- 
ent metaphysical constructions. But we have to avoid confusion between the legitimacy 
of a personal metaphysical position and the consensus of an entire community. Cosmology,
like Darwinian evolution, is more exposed to relativism, both because it entangles historical
and experimental proofs and because people wrongly project hopes (and fears) onto the idea
that it can resolve some philosophical issues, such as that of origins.


6.1.4 Outline 


This text proposes a reflexion on these themes. To that purpose, I shall first, in Section 6.2, 
discuss the structures of physical theories, the implication of the existence of complexity 
and the notion of effective theories. Section 6.3 describes a recent example of how a math- 
ematical structure now taken for granted in theoretical physics, namely the existence of a 
Lorentzian metric, can be circumvented and how it leads to new possibilities concerning 
the notion of time. Section 6.4 considers the role of fundamental constants in the theories
of physics, discussing their importance, the fact that they may be only effective quantities
and hence not constant, as well as their implications for the fine-tuning argument.


6.2 On the Construction of Physical Theories


The history of physics suggests that there are no mathematical structures that can be safely 
supposed to survive the evolution of our knowledge. 

Part of the art of theoretical physics is to find the mathematical structures that allow us to 
formalise and ‘decomplexify’ the laws of nature. These structures include the description 
of space-time (dimension, topology,...), of the matter and interactions (fields, symme- 
tries,...). While there is a large freedom in the choice of these mathematical structures, 
the developments of theoretical physics taught us that some of them are better suited to 
describe some classes of phenomena. But these choices are only validated by the mathe- 
matical consistency of the theory and, in the end, by the agreement of their consequences 
with experiments. Mathematics also has the power to reveal aspects of nature that are not
accessible directly. This is the case of the Dirac equation, which revealed the existence of
anti-matter purely from the mathematical consistency of the equations. Of course,
experiments are then needed to check this insight. The magic here is that most of the
time, such structures have turned out to exist. This power of mathematics has led some
(wrongly) to think that it may be sufficient to stick to the mathematical consistency of the
construction.

Some structures, considered as fundamental in the domain of validity of a theory, can be
replaced by other structures in a new description, e.g. the spacetime of Newtonian physics
can be formally seen as the limit of the Minkowski structure in regimes where v ≪ c.
In this respect, the role of fundamental constants as concept synthesisers or as limiting
quantities is central to signal the need for new mathematical structures outside a domain
of validity (Uzan, 2003). At each step, some properties, such as the topology of space,
the number of spatial dimensions or the numerical values of the free parameters that are
the fundamental constants (Uzan, 2003), may remain a priori free in one framework, or be
imposed in another framework (e.g. the number of space dimensions is fixed in string
theory (Polchinski, 1998)). It may even be that different structures can reproduce what
we know about physics and one has to rely on less-defined criteria, such as simplicity and
economy, to choose between them.
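
As a hedged illustration of the limit mentioned above, one can use the standard textbook form of a Lorentz boost along one axis (this is not a formula taken from the chapter; the symbols x, t, v, c and γ are introduced here only for the purpose of the example). Expanding in v/c at fixed x, t and v shows how the Galilean transformation of Newtonian spacetime is recovered:

```latex
\begin{align}
  x' &= \gamma\,(x - v t), \qquad
  t' = \gamma\left(t - \frac{v\,x}{c^{2}}\right), \qquad
  \gamma = \frac{1}{\sqrt{1 - v^{2}/c^{2}}}
         = 1 + \frac{v^{2}}{2c^{2}} + \mathcal{O}\!\left(\frac{v^{4}}{c^{4}}\right), \\
  x' &\longrightarrow x - v t, \qquad
  t' \longrightarrow t
  \qquad \text{as } c \to \infty \text{ at fixed } x,\ t,\ v .
\end{align}
```

In this limit the constant c drops out of the kinematics altogether, which is one way of reading the statement that a fundamental constant acts as a limiting quantity marking the edge of a domain of validity.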

A natural question is to determine when a mathematical structure of a physical theory, 
that we know is limited in its validity, reveals one of the underlying structures of nature 
and can be given a high credence when we extrapolate the theory or can be generalised in 
the formulation of a more fundamental theory. 


6.2.1 Effective Theories 


The fact that we can understand the universe and its laws has a strong implication on the 
structure of the physical theories. At each step in their construction, we have been dealing 
with phenomena below a typical energy scale, because of technological constraints, and it turned
out (experimentally) that we have always been able to design a consistent theory valid 
in such a restricted regime. This is not expected in general and is deeply rooted in the 
mathematical structure of the theories that describe nature. 
We can call such a property a scale decoupling principle and it refers to the fact that
there exist energy scales below which effective theories are sufficient to understand a set of 
physical phenomena. Effective theories are the most fundamental concepts in the scientific 
approach to the understanding of nature and they always come with a domain of validity 
inside which they are efficient to describe all related phenomena. They are a successful 
explanation at a given level of complexity based on concepts of that particular level. It also 
means that they can only answer a limited set of questions and indeed cannot be blamed 
for this. For instance, we do not need to have understood and formulated string theory to 
formulate nuclear physics and we do not need to know anything about nuclear physics to 
develop atomic physics or chemistry, let alone biology. Similarly, electromagnetism
describes how protons and electrons attract each other independently of the fact that we 
have understood, or not, that protons are made of quarks and gluons. This implies that 
the structures of the theories are such that there is a kind of stability and independence of 
higher levels with respect to more fundamental ones. This property is also important from 
an experimental point of view, since we try to restrict to a system that is decoupled as much 
as one can from its environment so that it can be assumed isolated. Truly isolated systems 
never exist and the effect of the environment may play an important role in cases such 
as quantum entanglement or decoherence. It follows that various disciplines have developed
independently in quasi-autonomous domains, each of them having its own
ontology and dynamics that are independent of our ability to formulate a theory explaining 
these concepts. Sometimes two such effective theories will collide and show an inconsistency
whose resolution requires the introduction of new, more fundamental concepts, from
which the concepts of each of the theories can be derived as a limiting behaviour, at least
in principle (e.g. while the wave function of quantum mechanics has properties of both waves
and particles, one cannot state a priori when a photon will behave as a particle or as a wave,
as in e.g. a Young experiment performed photon by photon, so that in general one simply needs
to abandon the old and often intuitive concepts, at the price of otherwise entering into endless
debates). For example, Maxwell electromagnetism and Galilean kinematics are incompatible
at high velocity, which is at the origin of special relativity with its new concept of
spacetime; another example is the concept of wave function, of which the preexisting concepts
of particle and wave are limiting behaviours. Note that this implies that concepts that were
thought to be incommensurable (such as space and time, or momentum and wave number) need to be
unified, which is usually achieved by the introduction of new fundamental constants (the speed
of light, or the Planck constant, in the two examples at hand) that were not considered as
fundamental (or even to exist) in the previous theories (see e.g. Uzan and Lehoucq (2005) for
a full description of the role of constants in that context).

It follows that the set of theories we are using to describe the world around us can be split
into a hierarchy of levels of reality (see Figure 6.3) as characterised by their corresponding
academic subjects (Ellis, 2005, 2012). We emphasise that this hierarchy is based on a
hierarchy of explanations (space science is needed to do astronomy). It is however different
from what happens dynamically, since the formation of planets requires first stars to have
existed. Note also that the right branch is related, from a spacetime point of view, to the
left since one needs stars/planets for life to develop (there is a hierarchy in the formation of
structure).

Figure 6.3 Hierarchy of theories in terms of their level of complexity as proposed by Ellis (2005,
2012), rephrased to make explicit the bifurcation that appears at level 5 of complexity, at which
one needs to distinguish mineral chemistry (5a) and organic chemistry (5b). At this transition a
central emergent quantity, information, appears on the left branch, leading to new phenomena such as
reproduction-selection, evolution, life and consciousness. The development of the complexity levels
of the right branch is however conditional on that of the left branch.

Higher level behaviours are constrained by the lower level laws from which they emerge.
The relation between the different levels has the following properties (the way the different 
levels of complexity interact has been studied in depth by Ellis (2005, 2012)): 


• Higher level behaviours are constrained by the lower level laws from which they emerge.
This is the usual bottom-up causation in which microscopic forces determine what happens
at the higher levels. The more fundamental level gives the space of possibilities in which
a higher level can develop, by constraining e.g. causality, the type of interactions or
structures that can exist. As an example in nuclear physics, a free neutron is unstable
and will decay into a proton. While this is a fundamental process of nuclear physics, one can
understand from the quark composition of neutrons and protons why neutrons decay
into protons and not the reverse.

• Scale separation implies that at each level of complexity, one can define fundamental
concepts that are not affected by the fact that they may not be fundamental at a lower
level (e.g. we can consider protons and neutrons as fundamental particles to describe
many nuclear properties and forget that they are made of quarks). In that sense, many
of the higher level phenomena remain quite independent of the microscopic structures,
fields and interactions.

• At least for the lowest levels, the fact that physical theories are renormalisable implies
that they influence the higher levels mostly through some numbers. This is in particular
the case of the fundamental constants of a given effective theory. They cannot
be explained within the framework of this particular theory since there exists nothing
more fundamental. They can however be replaced by more fundamental constants of an
underlying level. For instance, the mass or the gyromagnetic factor of the proton are
fundamental constants of nuclear physics. They can however be ‘explained’, even if the
actual computation may be difficult (see Luo et al. (2011)), in terms of constants of the
lower level (such as the quark masses and binding energies). This explanation of the constants
of an effective theory may reveal new phenomena that could not be dealt with before
(e.g. the fundamental parameters of the effective theory may now be varying) but these
phenomena have to be at the margin (or below the error bars) of the experiments that
have validated the effective theory, since otherwise it would not have been a good theory
(with reproducible predictions)!

• Not all the concepts of a higher level can be explained in terms of lower level concepts.
Each level may require its own concepts that do not exist at, or are not even related to, lower
levels. These are emergent properties, so that the whole may not be understood in
terms of its parts. A typical example is the notion of information, which cannot be reduced
to chemical or physical concepts. The concept of temperature, however, can be explained
in terms of lower level concepts such as mean kinetic energy.

• The fact that there exists a lower level of complexity, and thus microscopic degrees of
freedom, implies that these degrees of freedom can be heated up, so that we expect
entropy and dissipation to appear. Black-hole entropy and Hawking temperature
are one argument for the fact that gravity may not be a fundamental interaction.

• The higher levels of complexity can backreact on the lower levels. This is the notion of
top-down causation. It can take different forms such as contextuality, selection effects,
control loops, etc. (see Ellis (2005, 2012)). Note that the understanding of phenomena of
the whole, i.e. of a higher level of complexity, contributes to the understanding of the
properties and dynamics of the parts as they function in, and also allow for the existence
of, the global structure.
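
The remark in the list above that temperature can be explained by lower level concepts can be made concrete with a standard kinetic-theory relation. This is offered only as an illustrative sketch, assuming a classical monatomic ideal gas; the symbols m (particle mass), v (particle speed), T (temperature) and k_B (Boltzmann constant) are introduced here and do not appear in the original text:

```latex
\begin{equation}
  \left\langle \tfrac{1}{2}\, m v^{2} \right\rangle \;=\; \tfrac{3}{2}\, k_{\mathrm{B}}\, T .
\end{equation}
```

Here the higher level quantity T is fixed entirely by an average over microscopic degrees of freedom, with the constant k_B converting between the two levels of description. No comparable reduction is claimed in the text for the notion of information.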


6.2.2 Emergence of Complexity in our Universe 


The universe is, at the moment, the largest embedding structure in which all phenomena of 
lower complexity levels do develop. It is thus important to describe how the structures of 
the universe and of its history offer the possibility for the emergence of other levels. The 
cosmological model describes an expanding universe cooling down and developing non- 
linear structures through the action of gravity, from an initial state at thermal equilibrium. 
We can thus summarise the evolution of our universe (see Figure 6.1) as 


(hot, homogeneous, simple) → (cold, inhomogeneous, complex).


In this particular case, most of the history of the universe relies on the properties of gen- 
eral relativity. The fact that it is universal, and the equivalence principle, tell us that the 
microscopic nature of the matter is (mostly) irrelevant to determine the large scale struc- 
ture of spacetime since all that matters is the averaged stress-energy tensor on the scales of 
interest. 
In the early universe, higher levels of complexity cannot emerge, simply because the
photon thermal bath is too energetic for complex structures to be stable. For instance,
there cannot be any light nuclei before BBN so that at temperatures above 100 MeV there
exist only free protons, neutrons, neutrinos, photons, electrons and positrons at thermal
equilibrium under the action of the weak and electromagnetic interactions (see chapter 4
of Peter and Uzan (2009)). Above 100 MeV, nuclear physics is not needed to describe the
universe. In the earlier phases, prior to preheating, the matter content of the universe
consisted of a scalar field, the inflaton, so that even particle physics was irrelevant. At
preheating, one needs to have a description of particle physics to describe the universe
(in particular, one needs to know the number of relativistic degrees of freedom, the mass
thresholds, their coupling to the inflaton itself, etc.). Before recombination, the matter in
the universe remains ionised so that there is no atomic physics going on (even though atomic
transitions are important to describe the process of recombination) and there cannot be
any molecules at that stage since they would instantaneously be destroyed. To start any
chemistry (level 5), one needs to wait for the first stars to synthesise elements heavier than
the existing hydrogen, helium, beryllium and lithium. The transition to level 5 thus requires
structures such as galaxies and first stars to start forming (in fact one even needs to wait
longer since these heavier nuclei have to be released back into the interstellar medium). This is
summarised in Figure 6.4.

It follows that no real complexity can emerge in the universe before it has cooled enough 
and before structures start forming. We can estimate that before a redshift of ~30 no level 
of complexity above 5b is actually reached in our universe. The complexity levels of the 
left branch start developing with the formation of the large scale structure. As shown in 
Figure 6.4 not only do the higher levels of complexity appear only at late time, but also in 
a very inhomogeneous way since they require the density of matter to be large enough for 
e.g. molecular interactions to be non-negligible. 
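
To give a rough sense of what "cooled enough" means at the redshifts mentioned above, the sketch below uses the standard scaling of the photon temperature with redshift, T(z) = T0 (1 + z). The present-day value T0 ≈ 2.725 K and the decoupling redshift z ≈ 1100 are standard figures assumed here, not numbers taken from the chapter, and the function name is hypothetical.

```python
# Photon (CMB) temperature as a function of redshift, T(z) = T0 * (1 + z).
# T0_KELVIN is an assumed standard value; it is not quoted in the text.

T0_KELVIN = 2.725  # present-day CMB temperature in kelvin (assumption)


def cmb_temperature_kelvin(redshift: float) -> float:
    """Return the photon temperature in kelvin at the given redshift."""
    if redshift <= -1.0:
        raise ValueError("redshift must be greater than -1")
    return T0_KELVIN * (1.0 + redshift)


if __name__ == "__main__":
    # z = 30 is the redshift quoted in the text; z = 1100 is an assumed,
    # approximate value for decoupling.
    for z in (0, 30, 1100):
        print(f"z = {z:5d}: T = {cmb_temperature_kelvin(z):8.1f} K")
```

Under these assumptions the photon bath at z ≈ 30 is below a hundred kelvin, compared with a few thousand kelvin at decoupling. The thermal bath alone therefore no longer prevents complex structures by then; what remains limiting is the formation of dense structures, as the text argues.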

There is a feature that comes with the appearance of the different levels of complexity. 
At each step, only a tiny part of the matter content is involved in the next complexity level. 
This could be referred to as a decimation principle. For instance, before baryogenesis, mat- 
ter and anti-matter are in equilibrium. When they annihilate, only a tiny part of the matter 
content survives and most of it is turned into radiation hence increasing the entropy of 
the universe. Then, only a small part of the nuclei is able to participate in chemistry, since
hydrogen and helium constitute 75 per cent and 24 per cent, respectively. Then, only a tiny 
part of the molecules is organic and only a small part of it contributes to self-reproducing 
molecules. Whether this is of some relevance and a generic feature of the emergence of 
higher levels of complexity still needs to be investigated. 


6.2.3 Causality 


This structure of layers of theories of increasing complexity and of nested Russian
dolls of effective theories also has implications for the notion of causality.
George Ellis (2005, 2012) has described how higher levels can backreact on the properties
of lower levels by top-down action. There are however limitations on how it can proceed.
Understanding this will teach us what to expect from an explanatory point of view from
different effective physical theories.

Figure 6.4 The different levels of complexity presented in Figure 6.3 can appear only in regions (and
times) where the physical conditions allow for these complex structures, usually more fragile, to be
stable. Before decoupling, i.e. 300,000 years after the big bang, temperature is too high for atoms to
exist and only very light nuclei have been formed so that no molecule can exist. The universe remains
mostly homogeneous, hot, with a very simple chemical composition and no level above 4 can emerge.
Only after decoupling do the large scale structures start growing under the effect of gravity (since
then radiation pressure has become negligible). The universe becomes inhomogeneous and locally
galaxies, stars and then planets can form, allowing (1) for spacetime regions where complexity levels
above 5b can be reached and (2) for a more variegated chemical composition with heavier nuclei and
then molecules. From Uzan (2013).

Below level 5, the causality of all theories is inherited from special relativity. Locally 
they all have a Lorentzian structure so that causality is dictated by the lightcone structure of 
the embedding spacetime. The physical conditions at a spacetime point M are dictated by 
initial conditions within the past lightcone and the equations of evolution. These theories 
are often required to have a well-posed Cauchy problem, which ensures that any future field
configuration is completely determined by initial conditions and that the notion of ‘future
configuration’ is globally well defined, which is thought of as encompassing the notion of
predictability (this can actually be an argument to exclude some theories; see e.g.
Esposito-Farèse et al. (2010); Fleury et al. (2014)). This property is related to the existence of
a fundamental constant which characterises the maximum speed of propagation of any
interaction, identified with the speed of light. While this is the case with our current theories,
it may not be the case for quantum theories of gravity in which Lorentz invariance may be
broken, in some bimetric theories of gravity, or in the more speculative theories we describe
in the next section.

Above level 5, top-down action becomes efficient. In general, it implies that the 
conditions at a given spacetime position become mostly independent of the initial con- 
ditions (Ellis, 2005, 2012). As argued above, the appearance of this level of complexity is 
localised both in time and space, so that the region of the universe in which the top-down 
causation is efficient is limited. Indeed, it has to be limited by the usual relativistic causal- 
ity but it is expected that the typical propagation speed of the action of the higher levels 
on the lower ones is much smaller than the speed of light, so that the future lightcone of 
influence is expected to be very narrow. For example, in principle humanity could backreact
on stellar physics to increase the lifetime of the Sun, once it is understood that this can be
achieved by homogenising the Solar fuel. It is however obvious that the time needed to achieve
the required technological evolution is much larger than 8 minutes. In the same way, the
simple fact that the first artificial probes have been exiting the Solar system only recently
demonstrates that the domain of influence of (human) life remains restricted to the Solar
system. It is thus safe to neglect it for the evolution of our galaxy. Note however that it may
well modify planetary science, since the terraforming of Mars is not completely out of reach.
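
The "8 minutes" quoted above is the light-travel time between the Sun and the Earth, i.e. the shortest delay that relativistic causality allows for any influence between the two. As a quick check, the sketch below divides one astronomical unit by the speed of light. Both constants are standard defined values and the function name is hypothetical, introduced only for this illustration.

```python
# Light-travel time from the Sun to the Earth, for comparison with the
# "8 minutes" quoted in the text. The helper name is illustrative only.

SPEED_OF_LIGHT_M_PER_S = 299_792_458    # exact, by definition of the metre
ASTRONOMICAL_UNIT_M = 149_597_870_700   # exact, by the IAU 2012 definition


def light_travel_time_minutes(distance_m: float) -> float:
    """Return the time, in minutes, that light needs to cover distance_m metres."""
    return distance_m / SPEED_OF_LIGHT_M_PER_S / 60.0


sun_earth_minutes = light_travel_time_minutes(ASTRONOMICAL_UNIT_M)

if __name__ == "__main__":
    print(f"Sun-Earth light-travel time: {sun_earth_minutes:.2f} minutes")
```

The result is a little over eight minutes. This sets the edge of the relativistic lightcone, whereas the text argues that the effective cone of top-down influence is far narrower.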


6.2.4 Mathematics 


Among the features of our theories is the constant use and efficiency of mathematics. 

The lowest levels of complexity enjoy the property that all elements are indistinguishable,
in the sense that any electron is strictly identical to any other electron. Moreover,
there is a limited (and actually small) number of different building blocks. This implies
that there exists a one-to-one mapping between the physical entities and their mathematical
descriptions. There is actually no need to distinguish them (even though caution is in
order since the mathematical structure can be changed in case of the discovery of a new
property). Once these structures are fixed, the way they can interact is actually fixed (by
symmetries and consistency between the different structures), and causality is mainly fixed
by the Lorentzian structure, as we discussed. We can thus conclude that, at these levels,
mathematics is prescriptive.

In higher levels of complexity, the situation is different because of combinatorics. One
can construct about a hundred stable nuclei with many hundreds of isotopes, which leads
to a number of molecules that cannot be estimated, up to macromolecules such as DNA.
In terms of causality, it is impossible to get rid of randomness, even classically, since it
appears as the fortuitous conjunction at a spacetime point of two independent causal chains.
It follows that we cannot associate a single mathematical structure to each physical entity.
We can indeed model a class of them in terms of a mean individual and fluctuations. This is
a powerful procedure, but one in which the mathematics is only descriptive. It
becomes a powerful (or not!) tool. But the true mathematical structure of the underlying
level is slowly hidden and diluted by complexity.


6.3 Lorentzian Structure 


Among all the mathematical structures used in theoretical physics, and in the framework 
of metric theories of gravitation, the signature of the metric is in principle arbitrary. 


6.3.1 Lorentz Signature 


Indeed, it seems that on the scales that have been probed so far there is the need for only
one time dimension and three spatial dimensions. It is also now universally accepted that
the relativistic structure, in particular as the cleanest way to implement the notion of
causality, is a central ingredient of the construction of any realistic field theory. Spacetime
enjoys a locally Minkowski structure. When gravity is included, the equivalence principle
implies (this is not a theoretical requirement, but just an empirical fact, required at a given
accuracy) that all the fields are universally coupled to the same Lorentzian metric. Thus,
we now take it for granted that spacetime is a four-dimensional (4D) manifold endowed
with a metric of signature (−,+,+,+).

While the existence of two time directions may lead to confusion (Bars, 2001; Halsted,
1982), several models for the birth of the universe (Friedman, 1998; Gibbons and Hartle,
1990; Gott and Li, 1998; Hartle and Hawking, 1983) are based on a change of signature
via an instanton in which a Riemannian and a Lorentzian manifold are joined across a
hypersurface which may be thought of as the origin of time. While there is no time in the
Euclidean region, where the signature is (+,+,+,+), it flips to (−,+,+,+) across this
hypersurface. Eddington (1923) even suggested that it can flip across some surface to
(−,−,+,+), and signature flips also arise in brane (Gibbons and Ishibashi, 2004; Mars et al.,
2007) or loop quantum cosmology (Mielczarek, 2014).

It is legitimate to investigate whether the signature of this metric is only a convenient 
way to implement causality or whether it is just a property of an effective description of 
a microscopic theory in which there is no such notion. What does it take to construct a 
realistic field theory in a positive definite metric? 


6.3.2 A Clock Field as a New Ingredient 


Let us start with physics in flat spacetime and consider a 4D Riemannian manifold $\mathcal{M}$ with
a positive definite Euclidean metric $g^{\rm E}_{\mu\nu} = \delta_{\mu\nu}$ in Cartesian coordinates. The theories on
this manifold thus have no natural concept of time.

Figure 6.5 Example of a spatial configuration of the clock field. Locally, one can define regions
such as $\mathcal{M}_0$, $\mathcal{M}_0'$ and $\mathcal{M}_0''$, in each of which a time direction emerges. This direction does
not preexist at the microscopic level and can differ from patch to patch. From Mukohyama
and Uzan (2013).


If one considers a scalar field $\chi$ and assumes that its Lagrangian is a scalar and
leads to second-order equations, the only terms that can be included are a ‘kinetic term’,
$\delta^{\mu\nu}\partial_\mu\chi\,\partial_\nu\chi$, and a potential term $V(\chi)$. As a consequence its field equation is elliptic,
determining a static configuration.

In order to make dynamics emerge locally, we introduce a scalar field $\phi$ such that:
(1) its derivative has a non-vanishing vacuum expectation value in a region $\mathcal{M}_0$ of the
Riemannian space (see Figure 6.5) and (2) it couples to all other fields.

The first condition is implemented by assuming that $\partial_\mu\phi = \mathrm{const.} \neq 0$ in $\mathcal{M}_0$. We can
thus set $\partial_\mu\phi = M^2 n_\mu$ on $\mathcal{M}_0$ with $n_\mu$ a unit constant vector. Its norm
$X_{\rm E} = \delta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi = M^4$ is constant and satisfies $X_{\rm E} > 0$ on $\mathcal{M}_0$. Now, under this
assumption, one of the coordinates of the Euclidean space can be chosen as $dt = n_\mu dx^\mu$, that is

$$ t = \frac{\phi}{M^2} \tag{6.1} $$

up to a constant that can be set to zero without loss of generality, and we introduce three
independent coordinates $x^i$ ($i = 1 \ldots 3$) on the hypersurfaces $\Sigma_t$ normal to $n^\mu$.

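The second condition is implemented by considering that $\chi$ couples only to $\partial_\mu\phi$ so that
one can consider the action

$$ S_\chi = \int d^4x \left[ -\frac{1}{2}\delta^{\mu\nu}\partial_\mu\chi\,\partial_\nu\chi - V(\chi) + \frac{1}{M^4}\left(\delta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\chi\right)^2 \right]. \tag{6.2} $$

On $\mathcal{M}_0$, where the condition (6.1) holds, it is straightforward to deduce that

$$ S_\chi \to \int_{\mathcal{M}_0} dt\,d^3x \left[ \frac{1}{2}(\partial_t\chi)^2 - \frac{1}{2}\delta^{ij}\partial_i\chi\,\partial_j\chi - V(\chi) \right] \tag{6.3} $$

$$ = \int_{\mathcal{M}_0} dt\,d^3x \left[ -\frac{1}{2}\eta^{\mu\nu}\partial_\mu\chi\,\partial_\nu\chi - V(\chi) \right]. \tag{6.4} $$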


Hence, on $\mathcal{M}_0$, the action (6.2) describes the dynamics of a scalar field propagating in an
effective 4D Minkowski spacetime with metric $\eta_{\mu\nu} = \mathrm{diag}(-1, +1, +1, +1)$. The apparent
Lorentzian dynamics, with a preferred time direction, results from the coupling to $\phi$. As
a consequence, $\phi$ is related to what we usually call ‘time’, so that we shall call it a clock
field.
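
The sign flip behind this result can be written out in one short step. The lines below are a
worked sketch, assuming the normalisation $\partial_\mu\phi = M^2 n_\mu$ on $\mathcal{M}_0$, the coefficient $1/M^4$ for the
coupling term in (6.2), and coordinates adapted to $n^\mu$ so that $n^\mu\partial_\mu = \partial_t$:

```latex
\begin{align*}
\delta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\chi
  &= M^2\, n^\nu\partial_\nu\chi = M^2\,\partial_t\chi, \\
\frac{1}{M^4}\left(\delta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\chi\right)^2
  &= (\partial_t\chi)^2, \\
-\frac{1}{2}\delta^{\mu\nu}\partial_\mu\chi\,\partial_\nu\chi + (\partial_t\chi)^2
  &= -\frac{1}{2}\left[(\partial_t\chi)^2 + \delta^{ij}\partial_i\chi\,\partial_j\chi\right] + (\partial_t\chi)^2 \\
  &= +\frac{1}{2}(\partial_t\chi)^2 - \frac{1}{2}\delta^{ij}\partial_i\chi\,\partial_j\chi .
\end{align*}
```

The Euclidean kinetic term contributes $-\frac{1}{2}(\partial_t\chi)^2$ and the coupling to the clock field adds
$+(\partial_t\chi)^2$, so only the derivative along $n^\mu$ changes sign; the three transverse directions are untouched.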


6.3.3 Classical Field Theory in Flat Spacetime 


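In the previous example, the clock field allows for the emergence of an effective Lorentzian
dynamics because the scalar field is actually coupled to the effective metric
$\hat{g}^{\mu\nu} = \delta^{\mu\nu} - \frac{2}{M^4}\delta^{\mu\alpha}\delta^{\nu\beta}\partial_\alpha\phi\,\partial_\beta\phi$, which reduces on $\mathcal{M}_0$ to $\eta^{\mu\nu}$.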
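
That the effective metric has one negative and three positive directions for any orientation of
the clock-field gradient can be checked numerically. The sketch below assumes that on the patch
the effective inverse metric takes the form $\delta^{\mu\nu} - 2n^\mu n^\nu$ with $n$ a Euclidean unit vector; the
function names are illustrative and not taken from the cited papers.

```python
def effective_inverse_metric(n):
    """Return the 4x4 matrix delta^{mu nu} - 2 n^mu n^nu for a Euclidean unit vector n."""
    return [[(1.0 if mu == nu else 0.0) - 2.0 * n[mu] * n[nu] for nu in range(4)]
            for mu in range(4)]


def quadratic_form(matrix, p):
    """Evaluate p_mu g^{mu nu} p_nu for a covector p."""
    return sum(p[mu] * matrix[mu][nu] * p[nu] for mu in range(4) for nu in range(4))


# A clock-field gradient that is not aligned with any coordinate axis (norm = 1).
n = [0.5, 0.5, 0.5, 0.5]
g = effective_inverse_metric(n)

# Along n the quadratic form is negative (time-like direction) ...
along_clock = quadratic_form(g, n)

# ... while it stays positive on three mutually orthogonal directions normal to n.
normal_directions = ([1, -1, 0, 0], [0, 0, 1, -1], [1, 1, -1, -1])
normal_values = [quadratic_form(g, v) for v in normal_directions]

print(along_clock, normal_values)
```

The time direction is thus picked out by the clock field itself rather than by the coordinates,
which is why it can differ from patch to patch.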

This construction can actually be generalised easily to vector fields and Dirac
spinors (Mukohyama and Uzan, 2013), and to Majorana and Weyl spinors (Kehayias et al.,
2014), which allows us (Mukohyama and Uzan, 2013) to construct the full action of the
standard model of particle physics in flat spacetime at the classical level.


6.3.4 Gravity and Physics in Curved Spacetime 


It is even possible to extend this construction to gravity (Mukohyama and Uzan, 2013). 

To this purpose, we shall consider a general 4D Riemannian manifold $\mathcal{M}$ with a positive
definite metric $g^{\rm E}_{\mu\nu}$. To minimise the number of degrees of freedom, we demand that the
equation of motion for $\phi$ is second order. Hence, its action is restricted to the Riemannian
version of the Horndeski theory (Horndeski, 1974) with shift symmetry. Equivalently, it is
given by the shift-symmetric generalised Galileon (Deffayet et al., 2011).

For the effective equations, i.e. once the Lorentzian structure has emerged, we would like
to ensure that the system is invariant not only under time translation but also under CPT.
For this reason, we require that besides the shift symmetry ($\phi \to \phi + \mathrm{const.}$) the theory
also enjoys a $Z_2$ symmetry ($\phi \to -\phi$) for the clock field action. With these symmetries,
the Riemannian action reduces to

$$ S_g = \int d^4x\,\sqrt{g_{\rm E}}\left\{ G_4(X_{\rm E})R_{\rm E} + K(X_{\rm E}) - 2G_4'(X_{\rm E})\left[(\nabla_{\rm E}^2\phi)^2 - (\nabla^{\rm E}_\mu\nabla^{\rm E}_\nu\phi)^2\right]\right\} \tag{6.5} $$

where $X_{\rm E} = g_{\rm E}^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi$.
As demonstrated in Mukohyama and Uzan (2013), it reduces on $\mathcal{M}_0$ to

$$ S_g = \int d^4x\,\sqrt{-g}\left\{ f(X)R + 2f'(X)\left[(\Box\phi)^2 - (\nabla^\mu\nabla^\nu\phi)(\nabla_\mu\nabla_\nu\phi)\right] + P(X)\right\} \tag{6.6} $$

in terms of $X = -g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi$, where $g_{\mu\nu}$ is the emergent Lorentzian metric and where
$f$ and $P$ are two functions related to $G_4$ and $K$. This action (6.6) is a special case of the
covariant Galileon (Deffayet et al., 2011) and the equations of motion are second order.
It can also be shown (Mukohyama and Uzan, 2013) that the action for scalar and vector
fields can easily be generalised to curved spacetime, while the case of spinors is still an
open question.


6.3.5 Discussion 


By introducing a coupling of the standard fields to a clock field, we have
shown that a Lorentzian dynamics can emerge on a patch $\mathcal{M}_0$ of a Riemannian space,
including gravity. This goes far beyond earlier attempts (Girelli et al., 2009).

It is important to emphasise that: (1) this construction is, for now, limited to classical
field theories; (2) when restricted to $\mathcal{M}_0$ all fields propagate in the same effective
Minkowski metric so that the equivalence principle is safe in first approximation; (3) the
couplings to the clock field have, however, been tuned for that purpose. Indeed the action (6.2)
could have been chosen as

$$ S_\chi = \int d^4x \left[ -\frac{\kappa_\chi}{2}\delta^{\mu\nu}\partial_\mu\chi\,\partial_\nu\chi - V(\chi) + \frac{\alpha_\chi}{2M^4}\left(\delta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\chi\right)^2 \right], $$

and a Lorentzian signature is recovered only if $\alpha_\chi > \kappa_\chi > 0$. In the case where these
constants are not tuned, different fields can have different lightcones. (4) In the bosonic
sector, since the theory is invariant under the Euclidean parity ($x^\mu \to -x^\mu$) and field parity
($\phi \to -\phi$), both P and T invariances in the Lorentzian theory are ensured. This explains
why we have included only quadratic terms in $\partial_\mu\phi$ in the actions for scalars and vectors. (5)
In the fermionic sector, one of the coupling terms is not CPT invariant after the clock field
has a vacuum expectation value. (6) The configuration of the clock field is not arbitrary but
determined by solving its equation of motion. Since its action enjoys a shift symmetry, it
will take the form of a current conservation.
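
Point (3) can be made concrete with a small numerical sketch. It assumes the two-parameter
scalar action with a Euclidean kinetic coefficient $\kappa/2$ and a clock-field coupling
$\alpha/(2M^4)$, so that on the patch the coefficient of $(\partial_t\chi)^2$ is $(\alpha - \kappa)/2$ and that of the
spatial gradients is $-\kappa/2$. The function names, and the normalisation in which the tuned case
$(\kappa, \alpha) = (1, 2)$ gives unit speed, are choices made for this illustration.

```python
import math


def kinetic_coefficients(kappa, alpha):
    """Coefficients of (d_t chi)^2 and of the spatial gradient term on the patch."""
    time_coeff = 0.5 * (alpha - kappa)
    space_coeff = -0.5 * kappa
    return time_coeff, space_coeff


def is_lorentzian(kappa, alpha):
    """A Lorentzian signature requires alpha > kappa > 0."""
    return alpha > kappa > 0


def propagation_speed(kappa, alpha):
    """Slope of the lightcone of the field, in units where the tuned case gives 1."""
    if not is_lorentzian(kappa, alpha):
        raise ValueError("no Lorentzian signature for these couplings")
    return math.sqrt(kappa / (alpha - kappa))


# Tuned couplings reproduce the action (6.2); detuned ones tilt the lightcone.
tuned_speed = propagation_speed(1.0, 2.0)
detuned_speed = propagation_speed(1.0, 3.0)
print(tuned_speed, detuned_speed)
```

Two fields with different ratios $\kappa/(\alpha - \kappa)$ would therefore see different lightcones, which is
why a universal metric requires the couplings of all fields to be tuned to the same ratio.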

In curved spacetime, gravity takes the form of a covariant Galileon theory that depends
on two free functions constrained by the stability of a Friedmann-Lemaître background
with respect to linear scalar and tensor perturbations. For the matter sector, the actions
for scalar and vector fields are easily generalised and each depends on two free parameters
$(\kappa, \alpha)$ that are allowed to be functions of $X_{\rm E}$ in general but may as well be assumed
constant. Besides, there is an environmental parameter which characterises the clock field
configuration on the patch $\mathcal{M}_0$. The emergent model has the following properties: (1) it
induces two components which respectively behave as dark matter and dark energy when
considering the dynamics of a homogeneous cosmology. This sets two constraints for the
cosmology to be consistent with standard cosmology, at least at the background level. (2)
In general, scalars and vectors propagate in two different effective metrics. In order for
the weak equivalence principle to hold, we have to impose that these two metrics coincide.
In the simplest situation in which the coefficients $(\kappa, \alpha)$ are assumed to be constant,
one only requires a tuning on the parameters of the Lagrangians, but then it is satisfied
whatever the configuration of the clock field. In this sense the tuning is not worse than
the usual assumption that all the fields propagate in the same metric. (3) In general, the
parameters entering our effective Lorentzian actions are environmentally determined. This
means that if $X_{\rm E}$ is not strictly constant on $\mathcal{M}_0$, fundamental constants may be spacetime
dependent, which is strongly constrained (Uzan, 2003, 2005, 2009, 2011). (4) The speeds of
light and of gravitons may not coincide, which is constrained by the observations of cosmic
rays (Moore and Nelson, 2001) because particles propagating faster than gravitational waves
emit gravi-Čerenkov radiation.

To conclude, from a theoretical point of view, such a construction gives a new insight 
into the need for a Lorentzian metric as a fundamental structure. As we have shown, this is 
not a mandatory requirement and a decent field theory, at least at the classical level, can be 
constructed from a Riemannian metric. Such a formalism may be fruitful in the debate on 
the emergence of time and, speculating, for the development of quantum gravity. 

It also opens up a series of questions and possibilities. We can list (1) the development 
of a quantum theory, (2) the possibility from the classical viewpoint that singularities in 
our local Lorentzian region may be related to singularities in the clock field (e.g. similar 
to topological defects) and not in the metric of the Euclidean theory. These are, for now, 
speculations but they illustrate that this framework may be fruitful for extending our current 
field theories, including general relativity. 


6.4 Fundamental Constants 


Fundamental constants play an important role in physics. In particular, they set the order 
of magnitude of phenomena; they allow one to forge new concepts; they characterise the 
domain of validity of the theory in which they appear. 
In gravitation, their constancy is related to the validity of the Einstein equivalence prin- 
ciple and, in cosmology, they play a central role in multiverse construction since they are 
used to quantify the level of fine-tuning. 


6.4.1 Role in Physical Theories 


We shall start by defining the fundamental constants of a theory as any parameter that is 
not determined by the theories we are using. 

These parameters have to be assumed constant from a theoretical point of view; since 
they enjoy no equation of evolution they cannot be expressed in terms of more fundamental 
quantities so that they can only be measured in the laboratory. From an experimental point 
of view, the reproducibility of experiments that have been used to validate the theories in 
which they appear ensures that this is a good approximation at the level of accuracy of these 
experiments on the time scales they span. This also means that testing for the constancy of 
these parameters is a test of the theories in which they appear and allows one to extend the 
knowledge of their domain of validity, at the limit of what metrology can achieve. 

The number of fundamental constants depends on what we consider as being the fundamental
theory of physics. Today, gravitation is described by general relativity, and the
three other interactions and all fundamental fields are described by the standard model
of particle physics. In such a framework, one has 22 unknown constants: the Newton constant,
six Yukawa couplings for the quarks and three for the leptons, the mass and vacuum
expectation value of the Higgs field, four parameters for the Cabibbo-Kobayashi-Maskawa
(CKM) matrix, three coupling constants and a UV cut-off, to which one must add the speed of
light and the Planck constant (Ellis and Uzan, 2005; Hogan, 2000; Uzan, 2003).

Indeed, when introducing new, more unified or more fundamental theories the number
of constants may change so that the list of what we call fundamental constants is a
time-dependent concept and reflects both our knowledge and ignorance (Weinberg, 1983). For
instance, we know today that neutrinos have to be somewhat massive. This implies that the
standard model of particle physics has to be extended and that it will involve at least seven
more parameters (three Yukawa couplings and four CKM parameters). On the other hand,
this number can decrease, e.g. if the non-gravitational interactions are unified. In such a
case, the coupling constants may be related to a unique coupling constant $\alpha_U$ and a mass
scale of unification $M_U$ through $\alpha_i^{-1}(E) = \alpha_U^{-1} + (b_i/2\pi)\ln(M_U/E)$, where the $b_i$ are
numbers which depend on the explicit model of unification. This would also imply that the
variations, if any, of various constants will be correlated.
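
The correlation between variations can be illustrated with a few lines of code. This is a minimal
sketch of the relation $\alpha_i^{-1}(E) = \alpha_U^{-1} + (b_i/2\pi)\ln(M_U/E)$; the numerical values of $b_i$, $\alpha_U$,
$M_U$ and $E$ below are illustrative assumptions, not taken from the text or from a specific
unification model.

```python
import math


def inverse_coupling(alpha_u, m_u, b_i, energy):
    """alpha_i^{-1}(E) = alpha_U^{-1} + (b_i / 2 pi) ln(M_U / E)."""
    return 1.0 / alpha_u + (b_i / (2.0 * math.pi)) * math.log(m_u / energy)


def inverse_coupling_shifts(alpha_u, delta_alpha_u, m_u, energy, b_coeffs):
    """Change of each alpha_i^{-1}(E) when alpha_U is shifted by delta_alpha_u."""
    return [inverse_coupling(alpha_u + delta_alpha_u, m_u, b, energy)
            - inverse_coupling(alpha_u, m_u, b, energy)
            for b in b_coeffs]


# Illustrative inputs (assumptions): three b_i, a unified coupling and two energy scales in GeV.
B_COEFFS = (6.6, 1.0, -3.0)
ALPHA_U = 1.0 / 24.0
M_U = 2.0e16
E_LOW = 91.0

low_energy = [inverse_coupling(ALPHA_U, M_U, b, E_LOW) for b in B_COEFFS]
shifts = inverse_coupling_shifts(ALPHA_U, 1e-3 * ALPHA_U, M_U, E_LOW, B_COEFFS)
print(low_energy)
print(shifts)
```

A change of the single parameter $\alpha_U$ shifts every $\alpha_i^{-1}(E)$ by the same amount, so the relative
variations of the low-energy couplings are tied to one another rather than independent.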

The tests of the constancy of fundamental constants take on their full importance in the
realm of the tests of the equivalence principle (Will, 1993). This principle, which states the
universality of free fall, the local position invariance and the local Lorentz invariance, is at
the basis of all metric theories of gravity and implies that all matter fields are universally
coupled to a unique metric $g_{\mu\nu}$, which we shall call the physical metric, $S_{\rm matter}(\psi, g_{\mu\nu})$.
The dynamics of the gravitational sector is dictated by the Einstein-Hilbert action
$S_{\rm grav} = \frac{c^3}{16\pi G}\int\sqrt{-g_*}\,R_*\,d^4x$, which defines the dynamics of a second metric $g^*_{\mu\nu}$.
General relativity assumes that both metrics coincide, $g_{\mu\nu} = g^*_{\mu\nu}$,
which implements the equivalence principle in its strong form.

The test of the constancy of constants is obviously a test of the local position invariance
hypothesis and thus of the equivalence principle. Let us also emphasise that it is deeply
related to the universality of free fall (Dicke, 1964), since if any constant $c_i$ is a
spacetime-dependent quantity, so will be the mass of any test particle. It follows that the
action for a point particle of mass $m_A$ is given by

$$ S_{\rm p.p.} = -\int m_A[c_j]\,c\,\sqrt{-g_{\mu\nu}(x)\,v^\mu v^\nu}\;dt $$

with $v^\mu = dx^\mu/dt$, so that its equation of motion is

$$ u^\nu\nabla_\nu u^\mu = -\left(\sum_i \frac{\partial\ln m_A}{\partial c_i}\frac{\partial c_i}{\partial x^\beta}\right)\left(g^{\beta\mu} + u^\beta u^\mu\right). \tag{6.7} $$

It follows that a test body does not enjoy a geodesic motion and experiences an anomalous
acceleration which depends on the sensitivity $f_{A,i} = \partial\ln m_A/\partial c_i$ of the mass $m_A$ to a
variation of the fundamental constants $c_i$. In the Newtonian limit, $g_{00} = -1 + 2\Phi_N/c^2$
so that $\mathbf{a} = \mathbf{g}_N + \delta\mathbf{a}_A$, with the anomalous acceleration
$\delta\mathbf{a}_A = -c^2\sum_i f_{A,i}\left(\nabla c_i + \dot{c}_i\,\mathbf{v}_A/c^2\right)$.

Such deviations are strongly constrained in the Solar system and also allow us to bound
the variation of the constants (Dent, 2007, 2008).
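
The composition dependence of this anomalous acceleration can be sketched numerically. The
code below assumes the Newtonian-limit expression $\delta\mathbf{a}_A = -c^2\sum_i f_{A,i}(\nabla c_i + \dot{c}_i\mathbf{v}_A/c^2)$; the
sensitivities and the gradient of the constant are hypothetical numbers chosen only to show the
structure of the effect.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def anomalous_acceleration(sensitivities, gradients, time_derivatives, velocity):
    """Return delta a_A = -c^2 sum_i f_{A,i} (grad c_i + cdot_i v_A / c^2) as a 3-vector.

    sensitivities    : f_{A,i} = d ln m_A / d c_i, one number per constant
    gradients        : spatial gradient of each c_i (3-vector, per metre)
    time_derivatives : time derivative of each c_i (per second)
    velocity         : velocity of the body (3-vector, m/s)
    """
    total = [0.0, 0.0, 0.0]
    for f_i, grad_i, cdot_i in zip(sensitivities, gradients, time_derivatives):
        for k in range(3):
            total[k] += f_i * (grad_i[k] + cdot_i * velocity[k] / C_LIGHT ** 2)
    return [-C_LIGHT ** 2 * component for component in total]


# Hypothetical set-up: one varying constant with a tiny gradient along x, bodies at rest.
gradient = [[1e-25, 0.0, 0.0]]
rate = [0.0]
at_rest = [0.0, 0.0, 0.0]

delta_a_body_a = anomalous_acceleration([1e-3], gradient, rate, at_rest)
delta_a_body_b = anomalous_acceleration([2e-3], gradient, rate, at_rest)

# Two bodies with different sensitivities fall differently in the same gradient.
differential = [a - b for a, b in zip(delta_a_body_a, delta_a_body_b)]
print(delta_a_body_a, delta_a_body_b, differential)
```

Only the difference between the sensitivities of the two bodies enters the differential
acceleration, which is the quantity that tests of the universality of free fall constrain.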

There are many ways to design a theory with dynamical constants. As long as one sticks
to field theories, the recipe is simple: one needs to promote a constant of the theory to
the status of a dynamical field, allowing e.g. for a kinetic term and a potential. The coupling
to the ‘standard fields’ can be guessed from the theory in which these quantities are
constants, but the functional forms of the couplings remain arbitrary. This has two
consequences. First, the equations derived with this parameter constant will be modified
and one cannot just let it vary in the equations derived by assuming it is constant. Second,
the variation of the action with respect to this new field provides a new equation describing
the evolution of this new parameter (i.e. of the constant). The field responsible for the
time variation of the ‘constant’ is also responsible for a long-range (composition-dependent)
interaction, i.e. it is at the origin of the deviation from general relativity, because
of its coupling to the standard matter fields.


6.4.2 Constraints on Their Time Variation 


Since any physical measurement reduces to the comparison of two physical systems, one of
them often used to realise a system of units, it only gives access to dimensionless numbers.
This implies that only the variation of dimensionless combinations of the fundamental
constants can be measured, and such a variation would actually also correspond to a
modification of the physical laws (see e.g. Uzan (2003, 2011)). Changing the value of some
constants while letting all dimensionless numbers remain unchanged would correspond to a
change of units. It follows that from the 22 constants of our reference model, we can pick three
of them to define a system of units (such as e.g. $c$, $G$ and $\hbar$ to define the Planck units)
so that we are left with 19 unexplained dimensionless parameters, characterising the mass
hierarchy, the relative magnitude of the various interactions, etc. Sometimes one
refers to a variation of a constant with dimension, typically $G$. This is dangerous, but usually
the units have been set in such a way that this constant is effectively dimensionless. For
instance, a varying-$G$ theory is meant as a theory in which $Gm_{\rm e}^2/\hbar c$ is varying and all mass
ratios are kept constant (see the recent criticism by Duff (2014) showing it may not be clear
for everyone).

Figure 6.6 Summary of the constraints on the time variation of the fundamental constants. It depicts
the various systems in the spacetime diagram with our past lightcone. The left bar gives the typical
magnitude of the constraint on the relative variation of the fine structure constant (−5 meaning
< $10^{-5}$, etc.) and the red bar a forecast of the expected improvement in the coming years. From Uzan
(2011).
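
The statement that only dimensionless combinations are measurable can be checked directly. The
sketch below evaluates the combination $Gm^2/\hbar c$ in two systems of units; taking $m$ to be the
electron mass is an assumption made for the example, and the numerical inputs are the CODATA 2018
recommended values in SI units.

```python
import math

# CODATA 2018 values in SI units.
G_SI = 6.67430e-11           # m^3 kg^-1 s^-2
M_ELECTRON_SI = 9.1093837015e-31  # kg
HBAR_SI = 1.054571817e-34    # J s
C_SI = 299_792_458.0         # m/s


def gravitational_coupling(G, m, hbar, c):
    """Dimensionless combination G m^2 / (hbar c)."""
    return G * m ** 2 / (hbar * c)


def change_units(G, m, hbar, c, length, time, mass):
    """Re-express (G, m, hbar, c) in units whose sizes are (length, time, mass) in the old units."""
    return (G / (length ** 3 / (mass * time ** 2)),
            m / mass,
            hbar / (mass * length ** 2 / time),
            c / (length / time))


si_value = gravitational_coupling(G_SI, M_ELECTRON_SI, HBAR_SI, C_SI)

# Centimetre-gram-second units: every dimensional constant changes its numerical value ...
cgs_constants = change_units(G_SI, M_ELECTRON_SI, HBAR_SI, C_SI,
                             length=1e-2, time=1.0, mass=1e-3)
# ... but the dimensionless combination does not.
cgs_value = gravitational_coupling(*cgs_constants)
print(si_value, cgs_value, math.isclose(si_value, cgs_value, rel_tol=1e-12))
```

A claimed variation of $G$ alone is therefore meaningful only once it is restated as a variation of
such a combination, with the remaining dimensionless ratios specified.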

The various physical systems that have been considered can be classified in many 
ways (Uzan, 2003, 2011). 

First, we can classify them according to their look-back time and more precisely their
spacetime position relative to our present position. This is summarised in Figure 6.6, which
represents our past lightcone, the location of the various systems (in terms of their redshift
$z$) and the typical level at which they constrain the time variation of the fine structure
constant. These systems include atomic clock comparisons ($z = 0$; Blatt et al. (2008);
Bize et al. (2003); Cingöz et al. (2008)), the Oklo phenomenon ($z \sim 0.14$; Damour and
Dyson (1996); Kuroda (1956)), meteorite dating ($z \sim 0.43$; Olive et al. (2004); Peebles and
Dicke (1962)), both having a spacetime position along the world line of our system and not
on our past lightcone, quasar absorption spectra ($z = 0.2$ to $4$; Chand et al. (2004, 2005);
Petitjean et al. (2009); Webb et al. (1999, 2001)), population III stars (Ekström et al., 2010),
cosmic microwave background anisotropy ($z \sim 10^3$; Ade et al. (2014)) and primordial
nucleosynthesis ($z \sim 10^8$; Coc et al. (2006, 2007)). Higher redshift systems offer
the possibility to set constraints on a larger time scale, but at the price of usually involving
other parameters such as the cosmological parameters. This is particularly the case of the
cosmic microwave background and primordial nucleosynthesis, the interpretation of which
requires a cosmological model.

Table 6.1. Summary of the systems considered to set constraints on the variation of the
fundamental constants. We summarise the observable quantities (see text for details), the
primary constants used to interpret the data and the other hypotheses required for this
interpretation. [$\alpha$: fine structure constant; $\mu$: electron-to-proton mass ratio; $g_i$: gyromagnetic
factor; $E_r$: resonance energy of the samarium-149; $\lambda$: lifetime; $B_D$: deuterium binding energy;
$Q_{np}$: neutron-proton mass difference; $\tau$: neutron lifetime; $m_e$: mass of the electron; $m_N$: mass of
the nucleon].

System             Observable                  Primary constraints                              Other hypothesis
Atomic clock       $\delta\ln\nu$              $g_i$, $\alpha$, $\mu$                           -
Oklo phenomenon    isotopic ratio              $E_r$                                            geophysical model
Meteorite dating   isotopic ratio              $\lambda$                                        -
Quasar spectra     atomic spectra              $g_p$, $\mu$, $\alpha$                           cloud properties
21 cm              $T_b$                       $g_p$, $\mu$, $\alpha$                           cosmological model
CMB                $T$                         $\mu$, $\alpha$                                  cosmological model
BBN                light element abundances    $Q_{np}$, $\tau$, $m_e$, $m_N$, $\alpha$, $B_D$  cosmological model

The systems can also be classified in terms of the physics they involve in order to be
interpreted (see Table 6.1). For instance, atomic clocks, quasar absorption spectra and the
cosmic microwave background require only the use of quantum electrodynamics to draw
the primary constraints, so that these constraints will only involve the fine structure
constant $\alpha$, the proton-to-electron mass ratio $\mu$ and the various gyromagnetic factors $g_i$. On the
other hand, the Oklo phenomenon, meteorite dating and nucleosynthesis require nuclear
physics and quantum chromodynamics to be interpreted.

For any system, setting constraints goes through several steps that we sketch here. First,
any system allows us to derive an observational or experimental constraint on an observable
quantity $O(G_k, X)$, which depends on a set of primary physical parameters $G_k$ and a set of
external parameters $X$, which usually are physical parameters that need to be measured or
constrained (e.g. temperature, ...). These external parameters are related to our knowledge
of the physical system and the lack of their knowledge is usually referred to as systematic
uncertainty.

From a physical model of the system one can deduce the sensitivities of the observables
to an independent variation of the primary physical parameters

$$ \kappa_{G_k} = \frac{\partial\ln O}{\partial\ln G_k}. \tag{6.8} $$

As an example, the ratio between various atomic transitions can be computed from quantum
electrodynamics to deduce that the ratio of two hyperfine-structure transitions depends
only on $g_i$ and $\alpha$ while the comparison of fine-structure and hyperfine-structure
transitions depends on $g_i$, $\alpha$ and $\mu$. For instance (Karshenboim, 2006)
$\nu_{\rm Cs}/\nu_{\rm Rb} \propto (g_{\rm Cs}/g_{\rm Rb})\,\alpha^{0.49}$ and $\nu_{\rm Cs}/\nu_{\rm H} \propto g_{\rm Cs}\,\mu\,\alpha^{2.83}$.

The primary parameters are usually not fundamental constants (e.g. the resonance
energy of the samarium $E_r$ for the Oklo phenomenon, the deuterium binding energy $B_D$
for nucleosynthesis, etc.). The second step is thus to relate the primary parameters to (a
choice of) fundamental constants $c_i$. This would give a series of relations

$$ \Delta\ln G_k = \sum_i d_{ki}\,\Delta\ln c_i. \tag{6.9} $$

The determination of the parameters $d_{ki}$ requires first to choose the set of constants $c_i$ (do
we stop at the masses of the proton and neutron, or do we try to determine the dependencies
on the quark masses, or on the Yukawa couplings and Higgs vacuum expectation
value, etc.? See e.g. Dent (2007) for various choices) and also requires us to deal with
nuclear physics and the intricate structure of quantum chromodynamics (QCD). In particular,
the energy scale of QCD, $\Lambda_{\rm QCD}$, is so dominant that at lowest order all parameters
scale as $\Lambda_{\rm QCD}^n$, with $n$ their mass dimension, so that the variation of the strong interaction
would not affect dimensionless parameters and one has to take into account the effect of the
quark masses.
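
The two steps, (6.8) and (6.9), chain together linearly: a set of fractional variations of the
constants is first mapped onto the primary parameters and then onto the observable. The following
is a minimal sketch of that bookkeeping; the parameter names and all numerical coefficients are
hypothetical placeholders rather than values from any physical system.

```python
def primary_shifts(d, dln_c):
    """Delta ln G_k = sum_i d_ki Delta ln c_i, for every primary parameter k."""
    return {k: sum(coeff * dln_c.get(name, 0.0) for name, coeff in row.items())
            for k, row in d.items()}


def observable_shift(kappa, d, dln_c):
    """Delta ln O = sum_k kappa_k Delta ln G_k, using the primary shifts above."""
    dln_g = primary_shifts(d, dln_c)
    return sum(kappa_k * dln_g.get(k, 0.0) for k, kappa_k in kappa.items())


# Hypothetical sensitivities of one observable to two primary parameters (step 1).
kappa = {"G1": 2.0, "G2": -1.0}
# Hypothetical dependence of the primary parameters on two constants (step 2).
d = {"G1": {"c1": 1.0, "c2": 0.5},
     "G2": {"c2": 3.0}}
# Assumed fractional variations of the constants.
dln_c = {"c1": 1e-6, "c2": 2e-6}

shift = observable_shift(kappa, d, dln_c)
print(primary_shifts(d, dln_c), shift)
```

Because contributions from different constants can partly cancel, as they do in this toy example,
a single observable constrains only one combination of the $\Delta\ln c_i$, and several systems with
different sensitivities are needed to bound the constants individually.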


6.4.3 Fine Tuning 


Fundamental constants are the centre of interest of the fine-tuning argument, according to 
which the laws of nature have to be adjusted in order for complexity to exist. 

As we have seen in the previous section, the study of the variation of the
fundamental constants allows us to quantify the fine tuning. For example, the study of
population III stars (Ekström et al., 2010) shows that the strength of the nuclear interaction
cannot vary by more than 1/1000 for carbon to be produced.

The question which I am not going to address here is when do we start worrying about 
such a fine tuning? I refer to the extensive discussion by Barnes (2012) and its references. 

Among the solutions that cannot be denied, the multiverse scenario is usually proposed.
It is based on the idea of a meta-theory (often thought of as string theory, but not
necessarily) that leads to a distribution of universes with different low energy physical laws
(usually involving some hypothesis on eternal inflation) and an observer selection effect
(such as the anthropic principle). In doing so, one needs to define a notion of different universes
or universes similar to ours. The study of the fundamental constants shows that a typical
variation of $10^{-7}$ to $10^{-5}$ of the masses and couplings leads to universes similar to ours
(i.e. in principle able to host life forms similar to ours) but that can be observationally
distinguished. This sets some coarse-graining scales in the space of models.

Figure 6.7 (Left) A model accounting for a spatial variation of the fundamental constants (Olive
et al., 2011). Given the relative sizes of the correlation length of the field that undergoes the phase
transition and the Hubble radius, one can either observe several patches (micro-landscape) or be
contained in a single patch (landscape). While the former can be constrained observationally the
second is out of reach of experiment. From Uzan (2012).

Many models can lead to a spatial distribution of the fundamental constants. Here again,
the same microphysics can lead to very different scenarios. For instance, I want to stress
that the same model (Olive et al., 2011) can account for a spatial distribution of the constants
on sub- or super-Hubble scales (see Figure 6.7) while invoking no other universes.


6.5 Conclusion 


The interactions between science and philosophy are so rich and numerous that it would
be a pity not to stimulate our thoughts reciprocally. As I have argued, being a theoretical
physicist and a cosmologist, my first duty is to deliver knowledge that has been validated
(cf. George Ellis’s cosmology vs. cosmologia discussion, in this volume) and that may (or
may not) set passive constraints on philosophical theories. It also requires an ontology that
cannot be found in the theory itself and for which philosophy is of great help. In cosmology,
we have to be clear on the level of credence of the models we are considering, and we must
make clear the layers of speculative theories they rely on, and not confuse the results of our
extrapolations with validated results. Once models have been excluded, we also have to make
explicit on the basis of which observational data.

I have given two examples of theoretical constructions that challenge what we usually
assume on the Lorentzian structure of spacetime and on the nature of the fundamental
constants. Both can be motivated by theoretical arguments. But, more importantly, I think
they show how local physics can affect the construction of a cosmological model and that
the space of possible theories is always larger than what we tend to assume.


Cosmological Structure Formation


JOEL R. PRIMACK 


7.1 Introduction 


Cosmology has finally become a mature science during the past decade, with predictions
now routinely confirmed by observations. The modern cosmological theory is known as
ΛCDM: CDM for cold dark matter, particles that moved sluggishly in the early universe
and thereby preserved fluctuations down to small scales (Blumenthal et al., 1984, see
Figure 7.1), and Λ for the cosmological constant (e.g. Lahav et al., 1991). A wide variety
of large-scale astronomical observations, including the Cosmic Microwave Background
(CMB), measurements of baryon acoustic oscillations (BAO), gravitational lensing, the
large-scale distribution of galaxies, and the properties of galaxy clusters, agree very well
with the predictions of the ΛCDM cosmology.

Like the standard model of particle physics, the ΛCDM standard cosmological model
requires the determination of a number of relevant cosmological parameters, and the theory
does not attempt to explain why they have the measured values, or to explain the
fundamental nature of the dark matter and dark energy. These remain challenges for the
future. But the good news is that the key cosmological parameters are now all determined
with unprecedented accuracy, and the six-parameter ΛCDM theory provides a very good
match to all the observational data including the 2015 Planck temperature and polarization
data (Planck Collaboration et al., 2015a). Within uncertainties less than 1 per cent, the
Universe has critical cosmic density, i.e. $\Omega_{\rm total} = 1.00$, and the Universe is Euclidean
(or “flat”) on large scales. The expansion rate of the Universe is measured by the Hubble
parameter $h = 0.6774 \pm 0.0046$, and $\Omega_{\rm matter} = 0.3089 \pm 0.0062$; this leads to the age
of the Universe $t_0 = 13.799 \pm 0.021$ Gyr. The power spectrum normalization parameter
is $\sigma_8 = 0.816 \pm 0.009$, and the primordial fluctuations are consistent with a purely
adiabatic spectrum of fluctuations with a spectral tilt $n_s = 0.968 \pm 0.006$, as predicted
by single-field inflationary models (Planck Collaboration et al., 2015a). The same cosmological
parameters that are such a good match to the CMB observations also predict the
observed distribution of density fluctuations from small scales probed by the Lyman alpha
forest¹ to the entire horizon, as shown in Figure 7.2. The near-power-law galaxy-galaxy
correlation function at low redshifts is now known to be a cosmic coincidence (Watson
et al., 2011). I was personally particularly impressed that the evolution of the galaxy-galaxy
correlations with redshift predicted by ΛCDM (Kravtsov et al., 2004) turned out to
be in excellent agreement with the subsequent observations (e.g. Conroy et al., 2006).

Figure 7.1 The origin of the CDM spectrum of density fluctuations. Left panel: Fluctuations
corresponding to mass scales $10^{6}M_\odot$, $10^{9}M_\odot$, etc., grow proportionally to the square of scale factor
$a$ when they are outside the horizon, and when they enter the horizon (cross the horizontal dashed
line) the growth of the fluctuation amplitude $\delta$ is much slower if they enter when the Universe is
radiation dominated (i.e. $a < a_{\rm eq}$). Fluctuations on mass scales $> 10^{15}M_\odot$ enter the horizon after it
becomes matter dominated, so their growth is proportional to scale factor $a$; that explains the larger
separation between amplitudes for such higher-mass fluctuations. Right panel: The resulting CDM
fluctuation spectrum ($k^{3/2}|\delta_k| = \Delta M/M$). This calculation assumed that the primordial fluctuations
are scale-invariant (Zel’dovich) and that $\Omega_{\rm matter} = 1$ and Hubble parameter $h = 1$. (From a 1983
conference presentation, Primack and Blumenthal (1984), reprinted in Primack (1984).)
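
The step from the Hubble parameter and matter density to the age of the Universe can be
reproduced with a closed-form expression. The sketch below assumes a spatially flat universe
containing only matter and a cosmological constant (radiation is neglected), for which
$t_0 = \frac{2}{3H_0\sqrt{\Omega_\Lambda}}\,\mathrm{arcsinh}\sqrt{\Omega_\Lambda/\Omega_{\rm matter}}$ with $\Omega_\Lambda = 1 - \Omega_{\rm matter}$; the function names are illustrative.

```python
import math

KM_PER_MPC = 3.0856775814913673e19   # kilometres in one megaparsec
SECONDS_PER_GYR = 3.15576e16         # seconds in 10^9 Julian years


def hubble_time_gyr(h):
    """1/H0 in Gyr for H0 = 100 h km/s/Mpc."""
    h0_per_second = 100.0 * h / KM_PER_MPC
    return 1.0 / h0_per_second / SECONDS_PER_GYR


def age_flat_lcdm_gyr(h, omega_matter):
    """Age of a flat matter + cosmological-constant universe (requires omega_matter < 1)."""
    omega_lambda = 1.0 - omega_matter
    factor = (2.0 / (3.0 * math.sqrt(omega_lambda))
              * math.asinh(math.sqrt(omega_lambda / omega_matter)))
    return factor * hubble_time_gyr(h)


# Central values quoted in the text.
age = age_flat_lcdm_gyr(h=0.6774, omega_matter=0.3089)
print(round(hubble_time_gyr(0.6774), 2), round(age, 2))
```

With the quoted central values this simplified formula lands within a fraction of a per cent of
the 13.799 Gyr obtained from the full analysis, showing that the age is almost entirely fixed by
$h$ and $\Omega_{\rm matter}$ once flatness is assumed.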

For non-astronomers, there should be a more friendly name than ΛCDM for the standard
modern cosmology. Since about 95 per cent of the cosmic density is dark energy (either a
cosmological constant with $\Omega_\Lambda = 0.69$ or some dynamical field that plays a similar cosmic
role) and cold dark matter with $\Omega_{\rm CDM} = 0.31$, I recommend the simple name “Double
Dark Theory” for the modern cosmological standard model (Primack and Abrams, 2006;
Abrams and Primack, 2011). The contribution of ordinary baryonic matter is only $\Omega_b =
0.05$. Only about 10 per cent of the baryonic matter is in the form of stars or gas clouds that
emit electromagnetic radiation, and the contribution of what astronomers call “metals”
(chemical elements heavier than helium) to the cosmic density is only $\Omega_{\rm metals} \approx 0.0005$,
most of which is in white dwarfs and neutron stars (Fukugita and Peebles, 2004). The
contribution of neutrino mass to the cosmic density is $0.002 < \Omega_\nu < 0.005$, far greater


1 The Lyman alpha forest is the many absorption lines in quasar spectra due to clouds of neutral hydrogen along the
line of sight to the quasar.

Figure 7.2 The r.m.s. mass variance $\Delta M/M$ predicted by ΛCDM compared with observations, from
CMB and the Atacama Cosmology Telescope (ACT) on large scales, brightest cluster galaxy weak
lensing, clusters, the SDSS galaxy distribution, to the Lyman alpha forest on small scales. This figure 
highlights the consistency of power spectrum measurements by an array of cosmological probes over 
a large range of scales. (Redrawn from Figure 5 in Hlozek et al. (2012), which gives the sources of 
the data.) 


than $\Omega_{\rm metals}$. Thus our bodies and our planet are made of the rarest form of matter in the
universe: elements forged in stars and stellar explosions. 

Potential challenges to ΛCDM on large scales come from the tails of the predicted distribution
functions, such as CMB cold spots and massive clusters at high redshifts. However,
the existing observations appear to be consistent thus far with predictions of standard
ΛCDM with standard primordial power spectra; non-Gaussian initial conditions are not
required (Planck Collaboration et al., 2015b). Larger surveys now underway may provide 
more stringent tests. 


7.2 Large-Scale Structure 


Large, high-resolution simulations permit detailed predictions of the distribution and prop- 
erties of galaxies and clusters. From 2005 to 2010, the benchmark simulations were 
Millennium-I (Springel et al., 2005) and Millennium-II (Boylan-Kolchin et al., 2009),
which have been the basis for more than 400 papers. However, these simulations used first-year
Wilkinson Microwave Anisotropy Probe (WMAP) cosmological parameters, including
$\sigma_8 = 0.90$, that are now in serious disagreement with the latest observations. Improved
cosmological parameters, simulation codes, and computer power have permitted more 
accurate simulations (Kuhlen et al., 2012; Skillman et al., 2014) including Bolshoi (Klypin 
et al., 2011) and BigBolshoi/MultiDark (Prada et al., 2012; Riebe et al., 2013), which have 
recently been rerun using the Planck cosmological parameters (Klypin et al., 2016). 

Dark matter halos can be characterized in a number of ways. A common one is by mass, 
but the mass attributed to a halo depends on a number of factors including how the outer 
edge of the halo is defined; popular choices include the spherical radius within which the 
average density is either 200 times critical density or the virial density (which depend on 
redshift). Properties of all the halos in many stored time steps of both the Bolshoi and
BigBolshoi/MultiDark simulations are available on the web in the MultiDark database.² For
many purposes it is more useful to characterize halos by their maximum circular velocity
$V_{\max}$, which is defined as the maximum value of $[GM(<r)/r]^{1/2}$, where $G$ is Newton’s
constant and $M(<r)$ is the mass enclosed within radius $r$. The reason this is useful is that
$V_{\max}$ is reached at a relatively low radius $r_{\max}$, closer to the central region of a halo where
stars or gas can be used to trace the velocity of the halo, while most of the halo mass is 
at larger radii. Moreover, the measured internal velocity of a galaxy (line of sight velocity 
dispersion for early-type galaxies and rotation velocity for late-type galaxies) is closely 
related to its luminosity according to the Faber-Jackson and Tully-Fisher relations. In
addition, after a subhalo has been accreted by a larger halo, tidal stripping of its outer parts
can drastically reduce the halo mass but typically decreases $V_{\max}$ much less. (Since the stellar
content of a subhalo is thought to be determined before it was accreted, some authors
define $V_{\max}$ to be the peak value at any redshift for the main progenitor of a halo.) Because
of the observational connection between larger halo internal velocity and brighter galaxy 
luminosity, a common simple method of assigning galaxies to dark matter halos and subhalos
is to rank order the galaxies by luminosity and the halos by $V_{\max}$, and then match them
such that the number densities are comparable (Kravtsov et al., 2004; Tasitsiomi et al.,
2004; Conroy et al., 2006). This is called “halo abundance matching” or (more modestly)
“sub-halo abundance matching” (SHAM) (Reddick et al., 2014). Halo abundance matching
using the Bolshoi simulation predicts galaxy-galaxy correlations (which are essentially
counts of the numbers of pairs of galaxies at different separation distances) that are in good
agreement with the Sloan Digital Sky Survey (SDSS) observations (Trujillo-Gomez et al.,
2011; Reddick et al., 2013).
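
The rank-ordering step just described can be sketched in a few lines of Python. This is only an illustrative toy, not the procedure used in the cited papers: the function name `abundance_match` and the catalogue values are invented for the example, and it assumes that both lists are drawn from the same volume (so that equal rank corresponds to equal cumulative number density) and that there is no scatter between luminosity and maximum circular velocity.

```python
def abundance_match(halo_vmax, galaxy_luminosity):
    """Pair halos with galaxies by rank: largest V_max gets the brightest galaxy.

    Both lists are assumed to describe the same volume, so equal rank means
    equal cumulative number density. Returns (halo_index, galaxy_index) pairs.
    """
    halos = sorted(range(len(halo_vmax)),
                   key=lambda i: halo_vmax[i], reverse=True)
    galaxies = sorted(range(len(galaxy_luminosity)),
                      key=lambda j: galaxy_luminosity[j], reverse=True)
    # zip stops at the shorter list, so surplus low-V_max halos get no galaxy
    return list(zip(halos, galaxies))


# Toy catalogue with made-up numbers: V_max in km/s, luminosity in arbitrary units
vmax = [120.0, 310.0, 75.0, 220.0, 45.0]
lum = [3.0, 9.5, 1.2, 6.1]

pairs = abundance_match(vmax, lum)
for h, g in pairs:
    print(f"halo V_max = {vmax[h]:6.1f} km/s  ->  galaxy L = {lum[g]:4.1f}")
```

In this toy catalogue there are more halos than galaxies, so the halo with the smallest circular velocity is left without a galaxy. A realistic implementation would add scatter and would work with number densities rather than raw list positions.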

Abundance matching with the Bolshoi simulation also predicts galaxy velocity—mass 
scaling relations consistent with observations (Trujillo-Gomez et al., 2011), and a galaxy 


2 The web address for the MultiDark simulation data center is www.cosmosim.org/cms/simulations/multidark-project/;
more detailed analyses of the Bolshoi-Planck and MultiDark-Planck simulations are available at
www.hipacc.ucsc.edu/Bolshoi/MergerTrees.html.
velocity function in good agreement with observations for maximum circular velocities
$V_{\max} \sim 100$ km/s, but higher than the HI Parkes All Sky Survey (HIPASS) and the Arecibo
Legacy Fast ALFA (ALFALFA) Survey radio observations (Zwaan et al., 2010; Papastergis
et al., 2011) by about a factor of 2 at 80 km/s and a factor of 10 at 50 km/s. This
either means that these radio surveys are increasingly incomplete at lower velocities, or else
ΛCDM is in trouble because it predicts far more small-$V_{\max}$ halos than there are observed
low-$V$ galaxies. A deeper optical survey out to 10 Mpc found no disagreement between
$V_{\max}$ predictions and observations for $V_{\max} \geq 60$ km/s, and only a factor of 2 excess of
halos compared to galaxies at 40 km/s (Klypin et al., 2015). This may indicate that there 
is no serious inconsistency with theory, since for V ~ 30 km/s reionization and feedback 
can plausibly explain why there are fewer observed galaxies than dark matter halos (Bul- 
lock et al., 2000; Somerville, 2002; Benson et al., 2003; Kravtsov, 2010; Wadepuhl and 
Springel, 2011; Sawala et al., 2012), and also the observed scaling of metallicity with 
galaxy mass (Dekel and Woo, 2003; Woo et al., 2008; Kirby et al., 2011). 

The radial dark matter density distribution in halos can be approximately fit by the simple
formula $\rho_{\rm NFW} = 4\rho_s x^{-1}(1 + x)^{-2}$, where $x = r/r_s$ (Navarro et al., 1996), and the
“concentration” of a dark matter halo is defined as $C = R_{\rm vir}/r_s$, where $R_{\rm vir}$ is the virial
radius of the halo. When we first understood that dark matter halos form with relatively
low concentration C ~ 4 and evolve to higher concentration, we suggested that “red” 
galaxies that shine mostly by the light of red giant stars because they have stopped forming 
stars should be found in high-concentration halos while “blue” galaxies that are still form- 
ing stars should be found in younger low-concentration halos (Bullock et al., 2001). This 
idea was recently rediscovered by Hearin and Watson, who used the Bolshoi simulation to 
show that this leads to remarkably accurate predictions for the correlation functions of red 
and blue galaxies (Hearin and Watson, 2013; Hearin et al., 2014). 
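
As a rough numerical illustration of how the NFW formula connects to the maximum circular velocity defined earlier, the following Python sketch integrates the quoted profile analytically, which gives an enclosed mass proportional to ln(1 + x) − x/(1 + x), and then locates the radius at which GM(<r)/r peaks. The function names are hypothetical and the units are arbitrary; the sketch assumes an exact, untruncated NFW halo.

```python
import math


def nfw_mass_shape(x):
    """Dimensionless enclosed mass m(x) = ln(1+x) - x/(1+x), with x = r/r_s."""
    return math.log1p(x) - x / (1.0 + x)


def vc_squared_shape(x):
    """V_c^2 = G M(<r)/r in units where 16*pi*G*rho_s*r_s**2 = 1."""
    return nfw_mass_shape(x) / x


def find_xmax(lo=0.1, hi=10.0, tol=1e-10):
    """Golden-section search for the peak of vc_squared_shape on [lo, hi]."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if vc_squared_shape(c) > vc_squared_shape(d):
            b = d   # the maximum lies in [a, d]
        else:
            a = c   # the maximum lies in [c, b]
    return 0.5 * (a + b)


def vmax_over_vvir(conc):
    """Ratio of peak circular velocity to circular velocity at the virial radius.

    Meaningful for concentrations above x_max, where the peak lies inside R_vir.
    """
    x_max = find_xmax()
    return math.sqrt(vc_squared_shape(x_max) / vc_squared_shape(conc))


x_max = find_xmax()
print(f"r_max / r_s = {x_max:.3f}")
for conc in (4.0, 10.0):
    print(f"C = {conc:4.1f}:  V_max / V_vir = {vmax_over_vvir(conc):.3f}")
```

Under these assumptions the peak sits at a little over twice the scale radius, well inside the virial radius for typical concentrations. The peak velocity exceeds the virial velocity by only a few per cent at C = 4 and by roughly 20 per cent at C = 10.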

The Milky Way has two rather bright satellite galaxies, the Large and Small Magellanic 
Clouds. It is possible using sub-halo abundance matching with the Bolshoi simulation to 
determine the number of Milky-Way-mass dark matter halos that have subhalos with high 
enough circular velocity to host such satellites. It turns out that about 55 per cent have no
such subhalos, about 28 per cent have one, about 11 per cent have two, and so on (Busha
et al., 2011a). Remarkably, these predictions are in excellent agreement with an analysis of
observations by the Sloan Digital Sky Survey (SDSS) (Liu et al., 2011). The distribution
of the relative velocities of central and bright satellite galaxies from SDSS spectroscopic 
observations is also in very good agreement with the predictions of the Millennium-II sim- 
ulation (Tollerud et al., 2011), and the Milky Way’s lower-luminosity satellite population is 
not unusual (Strigari and Wechsler, 2012). Considered in a cosmological context, the Magellanic
Clouds are likely to have been accreted within about the last Gyr (Besla et al., 2012),
and the Milky Way halo mass is $1.2^{+0.7}_{-0.4}\,({\rm stat.}) \pm 0.3\,({\rm sys.}) \times 10^{12} M_\odot$ (Busha et al., 2011b).


7.3 Galaxy Formation 


At early times, for example the CMB epoch about 400,000 years after the big bang, or 
on very large scales at later times, linear calculations starting from the ΛCDM fluctuation

Figure 7.3 The stellar disk of a large spiral galaxy like the Milky Way is about 100,000 light years
across, which is tiny compared with the dark matter halo of such a galaxy (from the Aquarius dark 
matter simulation, Springel et al., 2008), and even much smaller compared with the large-scale 
cosmic web (from the Bolshoi simulation, Klypin et al., 2011). 


spectrum allow accurate predictions. But on scales where structure forms, the fluctuations 
have grown large enough that they are strongly non-linear, and we must resort to simula- 
tions. The basic idea is that regions that start out with slightly higher than average density 
expand a little more slowly than average because of gravity, and regions that start out 
with slightly lower density expand a little faster. Non-linear structure forms by the process 
known by the somewhat misleading name “gravitational collapse” — misleading because 
what really happens is that when positive fluctuations have grown sufficiently that they are 
about twice as dense as typical regions their size, they stop expanding while the surround- 
ing universe keeps expanding around them. The result is that regions that collapse earlier 
are denser than those that collapse later; thus galaxy dark matter halos are denser than clus- 
ter halos. The visible galaxies form because the ordinary baryonic matter can radiate away 
its kinetic energy and fall toward the centers of the dark matter halos; when the ordinary 
matter becomes dense enough it forms stars. Thus visible galaxies are much smaller than 
their host dark matter halos, which in turn are much smaller than the large scale structure 
of the cosmic web, as shown in Figure 7.3. 

Astronomical observations represent snapshots of moments long ago when the light we
now observe left distant astronomical objects. It is the role of astrophysical theory to pro- 
duce movies — both metaphorical and actual — that link these snapshots together into a 
coherent physical picture. To predict cosmological large-scale structures, it has been suf- 
ficient to treat all the mass as dark matter in order to calculate the growth of structure 
and dark matter halo properties. But hydrodynamic simulations — i.e. including baryonic 
matter — are necessary to treat the formation and evolution of galaxies. 

An old criticism of ΛCDM has been that the order of cosmogony is wrong: halos grow
from small to large by accretion in a hierarchical formation theory like ΛCDM, but the
oldest stellar populations are found in the most massive galaxies — suggesting that these 
massive galaxies form earliest, a phenomenon known as “downsizing” (Cowie et al., 1996). 
The key to explaining the downsizing phenomenon is the realization that star formation is 
most efficient in dark matter halos with masses in the band between about $10^{10}$ and $10^{12} M_\odot$
(Figure 1 bottom in Behroozi et al., 2013). This goes back at least as far as the original Cold
Dark Matter paper (Blumenthal et al., 1984): see Figure 7.4. A dark matter halo that has the 
total mass of a cluster of galaxies today will have crossed this star-forming mass band at an 
early epoch, and it will therefore contain galaxies whose stars formed early. These galaxies 
will be red and dead today. A less massive dark matter halo that is now entering the star- 
forming band today will just be forming significant numbers of stars, and it will be blue 
today. The details of the origin of the star-forming band are still being worked out. Back in 
1984, we argued that cooling would be inefficient for masses greater than about $10^{12} M_\odot$
because the density would be too low, and inefficient for masses less than about $10^{8} M_\odot$
because the gas would not be heated enough by falling into these small potential wells.
Now we know that reionization, supernovae (Dekel and Silk, 1986), and other energy input
additionally impede star formation for halo masses below about $10^{10} M_\odot$, and feedback
from active galactic nuclei (AGN) additionally impedes star formation for halo masses
above about $10^{12} M_\odot$.
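
The heating argument can be made concrete with the virial temperature relation quoted in the caption of Figure 7.4. The Python sketch below evaluates it for a few velocity dispersions. It is only a back-of-the-envelope illustration: the function name is hypothetical, and it assumes that the mean molecular weight of about 0.6 is meant in units of the proton mass, so that the relation reads T = μ m_p V² / (3k).

```python
K_BOLTZMANN = 1.380649e-23    # Boltzmann constant, J/K
PROTON_MASS = 1.67262192e-27  # proton mass, kg


def virial_temperature(v_kms, mu=0.6):
    """Virial temperature in kelvin for a 3-D r.m.s. velocity dispersion in km/s.

    Assumes T = mu * m_p * V**2 / (3 k), with mu the mean molecular weight
    in units of the proton mass (about 0.6 for ionized primordial H + He).
    """
    v = v_kms * 1.0e3  # convert km/s to m/s
    return mu * PROTON_MASS * v * v / (3.0 * K_BOLTZMANN)


for v_kms in (30.0, 100.0, 300.0, 1000.0):
    print(f"V = {v_kms:6.0f} km/s  ->  T = {virial_temperature(v_kms):.2e} K")
```

With these assumptions a dispersion of 100 km/s corresponds to a few times 10^5 K, and the temperature scales as the square of the velocity, which is why shallow potential wells heat infalling gas so little.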

Early simulations of disk galaxy formation found that the stellar disks had much lower 
rotation velocities than observed galaxies (Navarro and Steinmetz, 2000). This problem 
seemed so serious that it became known as the “angular momentum catastrophe”. A major 
cause of this was excessive cooling of the gas in small halos before they merged to form 
larger galaxies (Maller and Dekel, 2002). Simulations with better resolution and more 
physical treatment of feedback from star formation appear to resolve this problem. In 
particular, the Eris cosmological simulation (Guedes et al., 2011) produced a very realistic
spiral galaxy, as have many simulations since then. Somerville and Davé (2015) is
an excellent recent review of progress in understanding galaxy formation. In the following 
I summarize some of the latest developments. There are now two leading approaches to 
simulating galaxies: 


• Low resolution, ~1 kiloparsec. The advantages of this approach are that it is possible
to simulate many galaxies and study galaxy populations and their interactions with
the circumgalactic and intergalactic media, but the disadvantages are that we learn 

Figure 7.4 The Star-Forming Band on a diagram of baryon density $n_b$ versus the three-dimensional
r.m.s. velocity dispersion $V$ and virial temperature $T$ for structures of various sizes in the universe,
where $T = \mu V^2/3k$, $\mu$ is mean molecular weight ($\approx 0.6$ for ionized primordial H + He) and $k$ is
Boltzmann’s constant. Below the No Metals and Solar Metals cooling curves, the cooling timescale
is more rapid than the gravitational timescale. Dots are groups and clusters. Diagonal lines show the
halo masses in units of $M_\odot$. (This is Figure 3 in Blumenthal et al., 1984, with the Star-Forming Band
added.)


relatively little about how galaxies themselves form and evolve at high redshifts. The
prime examples of this approach now are the Illustris (Vogelsberger et al., 2014b) and
EAGLE (Schaye et al., 2015) simulations. Like semi-analytic models of galaxy formation
(reviewed in Benson, 2010), these projects adjusted the parameters governing
star-formation and feedback processes in order to reproduce key properties of galaxies
at the present epoch, redshift $z = 0$. The Illustris simulation in a volume $(106.5\ {\rm Mpc})^3$
forms ~40,000 galaxies at the present epoch with a reasonable mix of elliptical and spiral
galaxies that have realistic appearances (Snyder et al., 2015), obey observed scaling
relations, have the observed numbers of galaxies as a function of their luminosity,
and were formed with the observed cosmic star formation rate (Vogelsberger et al.,
2014a). It forms massive compact galaxies by redshift $z = 2$ via central starbursts in
major mergers of gas-rich galaxies or else by assembly at very early times (Wellons
et al., 2015). Remarkably, the Illustris simulation also predicts a population of damped
Lyman α absorbers (DLAs, small-galaxy-size clouds of neutral hydrogen) that agrees
with some of the key observational properties of DLAs (Bird et al., 2014, 2015). The
EAGLE simulation in volumes up to $(100\ {\rm Mpc})^3$ reproduces the observed galaxy mass
function from $10^{8}$ to $10^{11} M_\odot$ at a level of agreement close to that attained by semi-analytic
models (Schaye et al., 2015), and the observed atomic and molecular hydrogen
content of galaxies out to $z \sim 3$ (Rahmati et al., 2015; Lagos et al., 2015).

• High resolution, ~10s of parsecs. The advantages are that it is possible to compare
simulation outputs in detail with high-redshift galaxy images and spectra to discover
the drivers of morphological changes as galaxies evolve, but the disadvantage is that
such simulations are so expensive computationally that it is as yet impossible to run
enough cases to create statistical samples. Leading examples of this approach are FIRE
simulations led by Phil Hopkins (e.g. Hopkins et al., 2014) and the ART simulation
suite led by Avishai Dekel and myself (e.g. Zolotov et al., 2015). We try to compensate
for the small number of high-resolution simulations by using simulation outputs to tune
semi-analytic models, which in turn use cosmological dark-matter-only simulations like
Bolshoi to follow the evolution of $\sim 10^{6}$ galaxies in their cosmological context (e.g.
Porter et al., 2014a, 2014b; Brennan et al., 2015).


The high-resolution FIRE simulations, based on the GIZMO smooth particle hydrodynamics
code (Hopkins, 2015) with supernova and stellar feedback, including radiative
feedback (RF) pressure from massive stars, treated with zero adjusted parameters, reproduce
the observed relation between stellar and halo mass up to $M_{\rm halo} \sim 10^{12} M_\odot$ and the
observed star formation rates (Hopkins et al., 2014). FIRE simulations predict covering
fractions of neutral hydrogen with column densities from $10^{17}\,{\rm cm}^{-2}$ (Lyman limit systems,
LLS) to $> 10^{20.3}\,{\rm cm}^{-2}$ (DLAs) in agreement with observations at redshifts z = 2-2.5
(Faucher-Giguère et al., 2015); this success is a consequence of the simulated galactic
winds. FIRE simulations also correctly predict the observed evolution of the decrease of
metallicity with stellar mass (Ma et al., 2015), and produce dwarf galaxies that appear to
agree with observations (Oñorbe et al., 2015) as we will discuss in more detail below.

The high-resolution simulation suite based on the ART adaptive mesh refinement (AMR) 
approach (Kravtsov et al., 1997; Ceverino and Klypin, 2009) incorporates at the sub-grid
level many of the physical processes relevant for galaxy formation. Our initial group of 30 
zoom-in simulations of galaxies in dark matter halos of mass (1-30) $\times 10^{12} M_\odot$ at redshift
$z = 1$ were run at 35-70 pc maximum (physical) resolution (Ceverino et al., 2012, 2015a).
The second group of 35 simulations (VELA01 to VELA35) with 17.5 to 35 pc resolution
of halos of mass (2-20) $\times 10^{11} M_\odot$ at redshift $z = 1$ have now been run three times
with varying inclusion of radiative pressure feedback (none, UV, UV+IR), as described in
Ceverino et al. (2014). RF pressure including the effects of stellar winds (Hopkins et al.,
2012, 2014) captures essential features of star formation in our simulations. In particular, 
RF begins to affect the star-forming region as soon as massive stars form, long before 
the first supernovae occur, and the amount of energy released in RF greatly exceeds that 

Figure 7.5 Face-on images of Vela26 simulated galaxy with UV radiation pressure feedback, at four
redshifts: (a) z = 3.6 when it is diffuse and star forming (dSF); (b) z = 2.7 when it has become
compact and star forming (cSF) with a red ex-situ clump; (c) z = 2.3 still cSF, now with in situ clumps
apparent in the V-band image; (d) compact and quenched (cQ) during a minor merger, with tidal 
features visible in the V-band image. Top panels: three-color composite images at high resolution; 
bottom panels: CANDELized V and H band images. The observed V band images correspond to 
ultraviolet radiation from massive young stars in the galaxy rest frame, while the observed H band 
images show optical light from the entire stellar population including old stars. The CANDELS 
survey took advantage of the infrared capability of the Wide Field Camera 3, installed on the last 
service visit to HST in 2009. 


released by supernovae (Ceverino et al., 2014; Trujillo-Gomez et al., 2015). In addition
to radiation pressure, the local UV flux from young star clusters also affects the cooling 
and heating processes in star-forming regions through photoheating and photoionization. 
We use our Sunrise code (Jonsson, 2006; Jonsson et al., 2006, 2010; Jonsson and Primack, 
2010) to make realistic images and spectra of these simulated galaxies in many wavebands 
and at many times during their evolution, including the effects of stellar evolution and of 
dust scattering, absorption, and re-emission, to compare with the imaging and photometry 
from CANDELS³ and other surveys; see Figure 7.5 for examples including the effect
of CANDELization (reducing the resolution and adding noise) to allow direct comparison
with Hubble Space Telescope (HST) images.
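
The two operations named in the description of CANDELization, reducing the resolution and adding noise, can be illustrated with a deliberately simplified Python sketch. It is not the procedure applied to the simulated images: the function names are hypothetical, block averaging stands in for whatever resampling and instrument modelling are really used, and the noise is assumed to be Gaussian with a fixed, made-up amplitude.

```python
import random


def rebin(image, factor):
    """Average factor x factor pixel blocks of a 2-D list of numbers."""
    ny, nx = len(image), len(image[0])
    if ny % factor or nx % factor:
        raise ValueError("image dimensions must be multiples of factor")
    out = []
    for j in range(0, ny, factor):
        row = []
        for i in range(0, nx, factor):
            block = [image[j + dj][i + di]
                     for dj in range(factor) for di in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def add_noise(image, sigma, seed=0):
    """Add independent Gaussian noise of standard deviation sigma to each pixel."""
    rng = random.Random(seed)  # seeded so that the example is reproducible
    return [[pix + rng.gauss(0.0, sigma) for pix in row] for row in image]


def degrade(image, factor, sigma, seed=0):
    """Lower the resolution first, then add noise to the coarse image."""
    return add_noise(rebin(image, factor), sigma, seed)


# Toy 8x8 "high-resolution" image with two point-like sources (made-up values)
sharp = [[0.0] * 8 for _ in range(8)]
sharp[3][3] = 16.0
sharp[5][6] = 4.0

coarse = rebin(sharp, 4)               # 2x2 image after block averaging
mock = degrade(sharp, 4, sigma=0.05)   # coarse image plus noise
print(coarse)
```

Block averaging conserves total flux once the change in pixel area is accounted for, which is the property that makes a degraded model image comparable with a lower-resolution observation.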

In comparing our simulations with HST observations, especially those from the CAN- 
DELS and 3D-HST surveys, we are finding that the simulations can help us interpret a 
variety of observed phenomena that we now realize are important in galaxy evolution. One 
is the formation of compact galaxies. Analysis of CANDELS images suggested (Barro 
et al., 2013, 2014a,b) that diffuse star-forming galaxies become compact galaxies (“blue 
nuggets”) which subsequently quench (“red nuggets”). We see very similar behavior in our
VELA simulations with UV radiative feedback (Zolotov et al., 2015, see Figure 2), and 
we have identified in our simulations several mechanisms that lead to compaction often 
followed by rapid quenching, including major gas-rich mergers, disk instabilities often 


3 CANDELS, the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey, was the largest-ever 
Hubble Space Telescope survey, see http://candels.ucolick.org/. 
triggered by minor mergers, and opposing gas flows into the central galaxy (Danovich
et al., 2015). 

Another aspect of galaxy formation seen in HST observations is massive star-forming 
clumps (Guo et al., 2012; Wuyts et al., 2013, and references therein), which occur in a
large fraction of star-forming galaxies at redshifts z = 1-3 (Guo et al., 2015). In our simulations
there are two types of clumps. Some are a stage of minor mergers; we call those
ex situ clumps. A majority of the clumps originate in situ from violent disk instabilities
(VDI) in gas-rich galaxies (Ceverino et al., 2012; Moody et al., 2014; Mandelker et al.,
2014). Some of these in situ clumps are associated with gas instabilities that help to create 
compact spheroids, and some form after the central spheroid and are associated with the 
formation of surrounding disks. We find that there is not a clear separation between these 
processes, since minor mergers often trigger disk instabilities in our simulations (Zolotov 
et al., 2015). 

Star-forming galaxies with stellar masses $M_* < 3 \times 10^{9} M_\odot$ at $z > 1$ have recently
been shown to have mostly elongated (prolate) stellar distributions (van der Wel et al., 
2014) rather than disks or spheroids, based on their observed axis ratio distribution. In 
our simulations this occurs because most dark matter halos are prolate especially at small 
radii (Allgood et al., 2006), and the first stars form in these elongated inner halos; at lower 
redshifts, as the stars begin to dominate the dark matter, the galaxy centers become disky 
or spheroidal (Ceverino et al., 2015b).

Both the FIRE and ART simulation groups and many others are participating in the 
Assembling Galaxies of Resolved Anatomy (AGORA) collaboration (Kim et al., 2014) to 
run high-resolution simulations of the same initial conditions with halos of masses $10^{10}$,
$10^{11}$, $10^{12}$, and $10^{13} M_\odot$ at $z = 0$ with as much as possible the same astrophysical assumptions.
AGORA cosmological runs using different simulation codes will be systematically
compared with each other using a common analysis toolkit and validated against obser- 
vations to verify that the solutions are robust — i.e. that the astrophysical assumptions are 
responsible for any success, rather than artifacts of particular implementations. The goals 
of the AGORA project are, broadly speaking, to raise the realism and predictive power of 
galaxy simulations and the understanding of the feedback processes that regulate galaxy 
“metabolism”. 

It still remains to be seen whether the entire population of galaxies can be explained 
in the context of ΛCDM. A concern regarding disk galaxies is whether the formation
of bulges by both galaxy mergers and secular evolution will prevent the formation of as 
many pure disk galaxies as we see in the nearby universe (Kormendy and Fisher, 2008). 
A concern regarding massive galaxies is whether theory can naturally account for the rela- 
tively large number of ultra-luminous infrared galaxies. The bright sub-millimeter galaxies 
were the greatest discrepancy between our semi-analytic model predictions compared with 
observations out to high redshift (Somerville et al., 2012). This could possibly be explained 
by a top-heavy stellar initial mass function, or perhaps more plausibly by more realistic
simulations including self-consistent treatment of dust (Hayward et al., 2011, 2013).
Clearly, there is much still to be done, both observationally and theoretically. It is possible
that all the potential discrepancies between ΛCDM and observations of relatively
massive galaxies will be resolved by a better understanding of the complex astrophysics
of their formation and evolution. But small galaxies might provide more stringent tests of
ΛCDM.


7.4 Smaller Scale Issues: Cusps 


Cusps were perhaps the first potential discrepancy pointed out between the dark matter 
halos predicted by CDM and the observations of small galaxies that appeared to be domi- 
nated by dark matter nearly to their centers (Flores and Primack, 1994; Moore, 1994). Pure 
dark matter simulations predicted that the central density of dark matter halos behaves
roughly as $\rho \sim r^{-1}$. As mentioned above, dark matter halos have a density distribution
that can be roughly approximated as $\rho_{\rm NFW} = 4\rho_s x^{-1}(1 + x)^{-2}$, where $x = r/r_s$ (Navarro
et al., 1996). But this predicted $r^{-1}$ central cusp in the dark matter distribution seemed
inconsistent with published observations of the rotation velocity of neutral hydrogen as a
function of radius. 
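
To make explicit where the central cusp comes from, the logarithmic slope of the NFW formula can be written out. This is a standard short derivation from the profile quoted above rather than a result taken from the cited papers:

```latex
\rho_{\rm NFW}(x) = \frac{4\rho_s}{x\,(1+x)^2}, \qquad x = \frac{r}{r_s},
\qquad
\frac{d\ln\rho_{\rm NFW}}{d\ln r} = -1 - \frac{2x}{1+x} = -\frac{1+3x}{1+x}.
```

The slope therefore tends to $-1$ for $x \ll 1$ (the cusp), equals $-2$ at $r = r_s$, and steepens to $-3$ for $x \gg 1$. A cored profile, by contrast, would have a slope approaching zero at small radii.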

In small galaxies with significant stellar populations, simulations show that central star- 
bursts can naturally produce relatively flat density profiles (Governato et al., 2010, 2012; 
Pontzen and Governato, 2012; Teyssier et al., 2013; Brooks, 2014; Brooks and Zolotov,
2014; Madau et al., 2014; Pontzen and Governato, 2014; Oñorbe et al., 2015; Nipoti
and Binney, 2015). Gas cools into the galaxy center and becomes gravitationally dominant,
adiabatically pulling in some of the dark matter (Blumenthal et al., 1986; Gnedin
et al., 2011). But then the gas is driven out very rapidly by supernovae and the entire cen- 
tral region expands, with the density correspondingly dropping. Several such episodes can 
occur, producing a more or less constant central density consistent with observations, as 
shown in Figure 7.6. The figure shows that galaxies in the THINGS sample are consistent
with ΛCDM hydrodynamic simulations. But simulated galaxies with stellar mass less
than about $3 \times 10^{6} M_\odot$ may have cusps, although Oñorbe et al. (2015) found that stellar
effects can soften the cusp in even lower-mass galaxies if the star formation is extended
in time. The observational situation is unclear. In Sculptor and Fornax, the brightest dwarf
spheroidal satellite galaxies of the Milky Way, stellar motions may imply a flatter central
dark matter radial profile than $\rho \sim r^{-1}$ (Walker and Peñarrubia, 2011; Amorisco and Evans,
2012; Jardel and Gebhardt, 2012). However, other papers have questioned this (Jardel and
Gebhardt, 2013; Breddels and Helmi, 2013, 2014; Richardson and Fairbairn, 2014).

Will baryonic effects explain the radial density distributions in larger low surface bright- 
ness (LSB) galaxies? These are among the most common galaxies. They have a range of 
masses but many have fairly large rotation velocities indicating fairly deep potential wells, 
and many of them may not have enough stars for the scenario just described to explain 
the observed rotation curves (Kuzio de Naray and Spekkens, 2011). Can we understand 
the observed distribution of the $\Delta_{1/2}$ measure of central density (Alam et al., 2002) and

Figure 7.6 Dark matter cores are generated by baryonic effects in galaxies with sufficient stellar
mass. The slope $\alpha$ of the dark matter central density profile $r^{\alpha}$ is plotted vs. stellar mass measured
at 500 parsecs from simulations described in Pontzen and Governato (2012). The solid NFW curve
assumes the halo concentrations given by Macciò et al. (2007). Large crosses: halos with $> 5 \times 10^{5}$
dark matter particles; small crosses: $> 5 \times 10^{4}$ particles. Squares represent galaxies observed by The
HI Nearby Galaxy Survey (THINGS). (Figure 3 in Pontzen and Governato, 2014.)


the observed diversity of rotation curves (Macciò et al., 2012b; Oman et al., 2015)? This
is a challenge for galaxy simulators. 

Some authors have proposed that warm dark matter (WDM), with initial velocities large 
enough to prevent formation of small dark matter halos, could solve some of these problems.
However, that does not appear to work: the systematics of galactic radial density
profiles predicted by WDM do not at all match the observed ones (Kuzio de Naray et al.,
2010; Macciò et al., 2012a, 2013). WDM that is warm enough to affect galaxy centers may
not permit early enough galaxy formation to reionize the universe (Governato et al., 2015).
Yet another constraint on WDM is the evidence for a great deal of dark matter substructure 
in galaxy halos (Zentner and Bullock, 2003), discussed further below. 


7.5 Smaller Scale Issues: Satellite Galaxies 


As the top panel of Figure 7.3 shows, ΛCDM predicts that there are many fairly massive
subhalos within dark matter halos of galaxies like the Milky Way and the Andromeda 
galaxy, more than there are observed satellite galaxies (Klypin et al., 1999; Moore et al., 
1999). This is not obviously a problem for the theory since reionization, stellar feedback, 
and other phenomena are likely to suppress gas content and star formation in low-mass 
satellites. As more faint satellite galaxies have been discovered, especially using multicolor 
information from SDSS observations, the discrepancy between the predicted and observed 
satellite population has been alleviated. Many additional satellite galaxies are predicted to 
be discovered by deeper surveys (e.g. Bullock et al., 2010), including those in the Southern
Hemisphere seen by the Dark Energy Survey (The DES Collaboration et al., 2015) and
eventually the Large Synoptic Survey Telescope (LSST). 

However, a potential discrepancy between theory and observations is the “too big to fail” 
(TBTF) problem (Boylan-Kolchin et al., 2011, 2012). The Via Lactea-II high-resolution
dark-matter-only simulation of a Milky Way size halo (Diemand et al., 2007, 2008) and
the six similar Aquarius simulations (Springel et al., 2008) all have several subhalos that
are too dense in their centers to host any observed Milky Way satellite galaxy. The brightest
observed dwarf spheroidal (dSph) satellites all have maximum circular velocities
$12 < V_{\max} < 25$ km/s. But the Aquarius simulations predict at least ten subhalos with
$V_{\max} > 25$ km/s. These halos are also
among the most massive at early times, and thus are not expected to have had their star for- 
mation greatly suppressed by reionization. They thus appear to be too big to fail to become 
observable satellites (Boylan-Kolchin et al., 2012). 

The TBTF problem is closely related to the cusp-core issue, since TBTF is alleviated 
by any process that lowers the central density and thus the internal velocity of satellite 
galaxies. Many of the papers finding that baryonic effects remove central cusps cited in 
the previous section are thus also arguments against TBTF. A recent simulation of regions 
like the Local Group found the number, internal velocities, and distribution of the satellite 
galaxies to be very comparable with observations (Sawala et al., 2014). 

Perhaps there is additional physics beyond ΛCDM that comes into play on small scales.
One possibility that has been investigated is warm dark matter (WDM). A simulation like
Aquarius but with WDM has fewer high-$V_{\max}$ halos (Lovell et al., 2012). But it is not clear
that such WDM simulations with the lowest WDM particle mass allowed by the Lyman 
alpha forest and other observations (Viel et al., 2013; Horiuchi et al., 2014) will have 
enough substructure to account for the observed faint satellite galaxies (e.g. Polisensky 
and Ricotti, 2011), and as already mentioned WDM does not appear to be consistent with 
observed systematics of small galaxies.

Another possibility is that the dark matter particles interact with themselves much more
strongly than they interact with ordinary matter (Spergel and Steinhardt, 2000). There are 
strong constraints on such self-interacting dark matter (SIDM) from colliding galaxy clus- 
ters (Harvey et al., 2015; Massey et al., 2015), and in hydrodynamic simulations of dwarf 
galaxies SIDM has similar central cusps to CDM (Bastidas Fry et al., 2015). SIDM can be 
velocity-dependent, at the cost of adding additional parameters, and if the self-interaction 
grows with an inverse power of velocity the effects can be strong in dwarf galaxies (Elbert 
et al., 2015). An Aquarius-type simulation but with velocity-dependent SIDM produced 
subhalos with inner density structure that may be compatible with the bright dSph satel- 
lites of the Milky Way (Vogelsberger et al., 2012). Whether higher-resolution simulations 
of this type will turn out to be consistent with observations remains to be seen. 


7.6 Smaller Scale Issues: Dark Matter Halo Substructure 


The first strong indication of galaxy dark matter halo substructure was the flux ratio anoma- 
lies seen in quadruply imaged radio quasars (“radio quads”) (Metcalf and Madau, 2001; 
Dalal and Kochanek, 2002; Metcalf and Zhao, 2002). Smooth mass models of lensing 
galaxies can easily explain the observed positions of the images, but the predictions of 
such models of the corresponding fluxes are frequently observed to be strongly violated. 
Optical and X-ray quasars have such small angular sizes that the observed optical and X-ray 
flux anomalies can be caused by stars (“microlensing”), which has allowed a measurement
of the stellar mass along the lines of sight in lensing galaxies (Pooley et al., 2012). But
because the quasar radio-emitting region is larger, the observed radio flux anomalies can
only be caused by relatively massive objects, with masses of order $10^6$ to $10^8\,M_\odot$, along
the line of sight. After some controversy regarding whether ΛCDM simulations predict
enough dark matter substructure to account for the observations, the latest papers concur
that the observations are consistent with standard theory, taking into account uncertainty
in lens system ellipticity (Metcalf and Amara, 2012) and intervening objects along the
line of sight (Xu et al., 2012, 2015). But this analysis is based on a relatively small number
of observed systems (Table 2 of Chen et al. (2011) lists the ten quads that have been
observed in the radio or mid-IR), and further observational and theoretical work would be 
very helpful. 

Another gravitational lensing indication of dark matter halo substructure consistent with 
ΛCDM simulations comes from detailed analysis of galaxy–galaxy lensing (Vegetti et al.,
2010, 2012, 2014), although much more such data will need to be analyzed to get strong 
constraints. Other gravitational lensing observations including time delays can probe the 
structure of dark matter halos in new ways (Keeton and Moustakas, 2009). Hezaveh et al. 
(2013, 2014) show that dark matter substructure can be detected using spatially resolved 
spectroscopy of gravitationally lensed dusty galaxies observed with ALMA. Nierenberg 
et al. (2014) demonstrate that subhalos can be detected using strongly lensed narrow-line 
quasar emission, as originally proposed by Moustakas and Metcalf (2003).

The great thing about gravitational lensing is that it directly measures mass along the line
of sight. This can provide important information that is difficult to obtain in other ways.
For example, the absence of anomalous skewness in the distribution of high redshift type
Ia supernovae brightnesses compared with low redshift ones implies that massive compact
halo objects (MACHOs) in the enormous mass range $10^{-2}$ to $10^{10}\,M_\odot$ cannot be the main
constituent of dark matter in the universe (Metcalf and Silk, 2007). The low observed
rate of gravitational microlensing of stars in the Large and Small Magellanic clouds by
foreground compact objects implies that MACHOs in the mass range between $0.6 \times 10^{-7}$
and $15\,M_\odot$ cannot be a significant fraction of the dark matter in the halo of the Milky Way
(Tisserand et al., 2007). Gravitational microlensing could even detect free-floating planets
down to $10^{-8}\,M_\odot$, just 1 per cent of the mass of the earth (Strigari et al., 2012).

A completely independent way of determining the amount of dark matter halo substruc- 
ture is to look carefully at the structure of dynamically cold stellar streams. Such streams 
come from the tidal disruption of small satellite galaxies or globular clusters. In numeri- 
cal simulations, the streams suffer many tens of impacts from encounters with dark matter 
substructures of mass $10^5$ to $10^7\,M_\odot$ during their lifetimes, which create fluctuations in the
stream surface density on scales of a few degrees or less. The observed streams contain just
such fluctuations (Yoon et al., 2011; Carlberg, 2012; Carlberg et al., 2012; Carlberg and
Grillmair, 2013), so they provide strong evidence that the predicted population of subha- 
los is present in the halos of galaxies like the Milky Way and M31. Comparing additional 
observations of dynamically cold stellar streams with fully self-consistent simulations will 
give more detailed information about the substructure population. The Gaia spacecraft’s 
measurements of the positions and motions of vast numbers of Milky Way stars will be 
helpful in quantifying the nature of dark matter substructure (Ngan and Carlberg, 2014; 
Feldmann and Spolyar, 2015). 


7.7 Conclusions 


ΛCDM appears to be extremely successful in predicting the cosmic microwave background
and large-scale structure, including the observed distribution of galaxies both
nearby and at high redshift. It has therefore become the standard cosmological framework 
within which to understand cosmological structure formation, and it continues to teach us 
about galaxy formation and evolution. For example, I used to think that galaxies are pretty 
smooth, that they generally grow in size as they evolve, and that they are a combination of 
disks and spheroids. But as I discussed in Section 7.3, HST observations combined with 
high-resolution hydrodynamic simulations are showing that most star-forming galaxies 
are very clumpy; that galaxies often undergo compaction, which reduces their radius and 
greatly increases their central density; and that most lower-mass galaxies are not spheroids 
or disks but are instead elongated when their centers are dominated by dark matter.

ΛCDM faces challenges on smaller scales. Although starbursts can rapidly drive gas
out of the central regions of galaxies and thereby reduce the central dark matter density,
it remains to be seen whether this and/or other baryonic physics can explain the observed
rotation curves of the entire population of dwarf and low surface brightness (LSB) galaxies.
If not, perhaps more complicated physics such as self-interacting dark matter may be
needed. But standard ΛCDM appears to be successful in predicting the dark matter halo
substructure that is now observed via gravitational lensing and stellar streams, and any
alternative theory must do at least as well.


8
Formation of Galaxies 


JOSEPH SILK 


8.1 Introduction 


Galaxies are key elements of the universe. They probe cosmology, and they control our existence.
The broad lines of their formation and evolution are clear. Beginning as infinitesimal
density fluctuations in the early universe, which left the observed relic pattern of temperature
fluctuations on the last scattering surface of the CMB, galaxy halos grew via gravitational
instability of cold, weakly interacting dark matter, within which baryons dissipated and
cooled into the observed galaxies. We are piecing together the missing steps, which involve the
assembly of massive halos from a hierarchy of merging subhalos. Memory remains in massive
halos of the substructure forged by gravity: this has been one of the major revelations
to come from computer simulations of structure formation in the expanding universe. 

A major advance in demonstrating that galaxies formed via gravitational instability in 
the early universe came with the discovery of fine-scale angular fluctuations in the CMB. 
These were predicted as essential relics if galaxies had indeed formed by the conjectured 
instability. Prior to their discovery, one had no idea of the initial conditions for seeding 
structure formation. Thermal fluctuations in standard cosmology were known to be too 
small. 

Observations provided the seeds. The breakthrough came with the COBE satellite in
1992. This provided the proof that temperature fluctuations are present and trace the
existence of large-scale density fluctuations. These had little to do, however, with the
search for the precursors of galaxies, other than giving confidence that the latter are present 
under the assumption of a scale-invariant density fluctuation spectrum as advocated by 
inflationary cosmology. 

The key theoretical insight indeed preceded the data. Inflationary cosmology was devel- 
oped in the 1980s, and provided for the first time a coherent understanding of the size of 
the universe, its near-Euclidean geometry, and of the origin of the seed density fluctua- 
tions from quantum fluctuations at the Planck epoch. Even inflationary cosmology remains 
incomplete, since there is still no theory of quantum gravity required in order to connect 
Planck-scale physics rigorously with the Einstein–Friedmann–Lemaître cosmology that
successfully describes the evolution of the universe. Such a connection is essential in order
to understand the nature of the acceleration that couples inflation to the current acceleration
of the universe, with an associated decrease in vacuum energy of some 120 orders of
magnitude. 

It was to take another decade before the elusive fluctuations that seeded
the large-scale galaxy distribution were found. Hints emerged with a series of sub-orbital
experiments, culminating in the detection of fluctuations that probed the horizon angular 
scale at matter-radiation decoupling, demonstrating conclusively that acausal fluctuations 
were present of sufficient strength to seed galaxy formation. Another important conclusion 
was that the geometry of the universe is Euclidean, as effectively measured by the horizon 
angular scale. The immediate consequence was that most of the matter in the universe has 
to be dark and weakly interacting. 

Nor did the mysteries end there. It rapidly became apparent that there was not enough 
dark matter to maintain the flatness of space without leading to other anomalies such 
as excessive peculiar velocities of galaxies. Finally, the acceleration of the universe was 
discovered from the use of type Ia supernovae as standard candles, to provide us with 
another challenge, the dominance of dark energy, capable of inducing recent cosmological 
acceleration via its constant energy density with equation of state $p = -\rho c^2$.
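
A short worked step may help here. It assumes the standard fluid (continuity) equation of Friedmann–Lemaître cosmology, which the text does not write out, with scale factor $a$ and mass density $\rho$; it is offered as an illustration of why this equation of state goes with a constant energy density, not as part of the original argument:

```latex
\dot{\rho} + 3\,\frac{\dot{a}}{a}\left(\rho + \frac{p}{c^{2}}\right) = 0
\qquad\Longrightarrow\qquad
p = -\rho c^{2} \;\;\text{implies}\;\; \dot{\rho} = 0 .
```

More generally, for $p = w\rho c^2$ with constant $w$ the same equation gives $\rho \propto a^{-3(1+w)}$, so only $w = -1$ leaves the density undiluted by the expansion.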

Dark energy and dark matter represent the infrastructure of modern cosmology. The 
nature and low value of the dark energy, that is of the cosmological constant, is a strong 
theoretical challenge. Conventional cosmology offers no explanation. The dark matter also 
remains an outstanding mystery. Here the challenge is more observational than theoretical, 
however. Many experiments world-wide are seeking to find evidence for the elusive weakly 
interacting particle that is the favoured dark matter candidate. Should these attempts fail, 
and there is already a full programme for the next decade and beyond, one may need to 
reconsider the dark matter paradigm. 

The building blocks of cosmology are the galaxies. These too contain significant 
unknowns. The detailed theory of galaxy formation remains a mystery. Semi-analytical 
and fully numerical simulations fail to reproduce both local and early universe observations
of galaxies (Silk et al., 2013; Somerville and Davé, 2014). Star formation must be regulated
by feedback; otherwise it occurs too rapidly, too efficiently and too early. However,
as more and more detailed physics is put into ever larger simulations, the final result
is inevitably that either star formation occurs too early or too late, because feedback is 
either too weak or too strong. Simulations that reproduce the Milky Way galaxy fail disas- 
trously at high redshift. Those tuned to high redshift do poorly at reproducing low redshift 
star-forming galaxies. 

Of course many phenomena can be ‘explained’: these are considered to be the successes 
of the theory of galaxy formation. Phenomena that can be reasonably well understood 
include individual morphologies of galaxies and clusters as well as statistical correlations 
for large samples. The former includes spheroidal and disk galaxy morphologies and angu- 
lar momentum, star formation histories, dwarf galaxy profiles, intracluster gas abundances 
and pressure, and the ubiquity of intergalactic hydrogen clouds. The latter includes environmental
correlations, chemical evolution, cosmic star formation history, the main sequence
of star formation, the correlation between star formation rate and gas surface density, and the
correlation between massive black-hole mass and spheroid velocity dispersion.

These are major accomplishments. Moreover, all are verifiable via formulating pre- 
dictions to be made with ever larger telescopes at ever higher redshifts, at epochs when 
galaxies were being assembled. So why the negative attitude? The point is that no sin- 
gle model explains everything. The dynamical range needed is beyond the reach of any 
foreseeable computers at least in the next decade. Approximations to the input physics are 
inevitable. All of the interesting astrophysical output lies in the inevitable deviations from 
any necessarily imperfect canonical model. 


8.2 Understanding Opacity-Limited Fragmentation 


How stars form is a difficult question to answer, since an incredibly broad dynamic range 
is needed. Local observations are useful, but offer no guarantee of compliance under the 
wildly different conditions in the early universe. 

The holy grail of star formation reduces to answering the following questions: 


• Can we predict the masses of stars?
• Can we predict the efficiency of star formation?
• Can we predict the rate of star formation?


Sadly, in any fundamental sense, the answers are negative on all three counts. But phe- 
nomenology is remarkably useful in that it allows us to study these questions in detail in 
the nearby universe. Armed with this experience we can try to tackle the more extreme 
conditions in the distant universe when galaxies were young. 

Let us begin from the beginning, at least as is appropriate for the formation of stars. The 
story begins with Newton who wrote 


if the matter was evenly disposed throughout an infinite space, it could never convene into one mass; 
but some of it would convene into one mass and some into another, so as to make an infinite number 
of great masses, scattered at great distances from one to another throughout all that infinite space. 
And thus might the sun and fixed stars be formed. 


(Isaac Newton, letter to Richard Bentley, 10 December, 1692). 


This is essentially the notion of gravitational instability. Newton abandoned his argument,
however, when he realised that


If the sun at rest were an opaque body like the planets or the planets lucid bodies like the sun, how he 
alone should be changed into a shining body whilst all they continue opaque, or all they be changed 
into opaque ones whilst he remains unchanged, I do not think explicable by mere natural causes, but 
am forced to ascribe it to the counsel and contrivance of a voluntary Agent. 


It was James Jeans, some two centuries later, who first quantified gravitational instability 
in 1902:


We have found that as Newton first conjectured. All celestial bodies originate by a process of
fragmentation of nebulae out of chaos, of stars out of nebulae, of planets out of stars and satellites out of
planets. From the intrinsic evidence of his creation, the Great Architect of the Universe now begins 
to appear as a pure mathematician. 


The Jeans criterion for gravitational stability is based on the concept that pressure gra- 
dients oppose collapse: sound waves must cross regions to communicate pressure changes 
to inhibit collapse. 

Consider a molecular cloud that is unstable to gravitational fragmentation. Under what 
conditions will it naturally fragment into stellar mass clumps? Let me first attempt to make 
an a priori estimate. The goal is to write the typical mass of a star in terms of fundamental 
units. Can one predict the mass of a star from fundamental physics arguments? 

I start with a cold interstellar gas cloud, the typical nursery of star formation. I assume 
the cloud is spherical and uniform. This may seem to adopt what has been denigratingly 
described as the spherical cow approximation, but it suffices for estimates of the relevant 
orders of magnitude of physical quantities. 

One can first derive the collapse time of a collapsing cold gas sphere of mass $M$ where the
gas density satisfies $\rho(r) = \sigma^2/(2\pi G r^2)$, and the velocity dispersion $\sigma$ is related to the
kinetic temperature $T$ and the sound speed $v_s$ by $\sigma^2 = kT/m_p = v_s^2$,
as in a self-gravitating isothermal sphere. Some useful scaling relations are radius
$R = v_s t_{\rm dyn} = M^{1/3}\rho^{-1/3} = GMv_s^{-2}$, and density $\rho = M/R^3 = v_s^6 M^{-2} G^{-3}$. Here $t_{\rm dyn}$ is the
free-fall time.
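
To get a feel for the numbers, the scaling relations can be evaluated directly. The sketch below is an illustration only: the function names are hypothetical, the constants are rounded CGS values, and the gas is assumed to be pure hydrogen with a mean molecular weight of 1, as in the relation $v_s^2 = kT/m_p$. All order-unity factors are dropped, as in the text.

```python
import math

# Rounded CGS constants.
G = 6.674e-8       # gravitational constant [cm^3 g^-1 s^-2]
K_B = 1.381e-16    # Boltzmann constant [erg K^-1]
M_P = 1.673e-24    # proton mass [g]
M_SUN = 1.989e33   # solar mass [g]
PC = 3.086e18      # parsec [cm]
YEAR = 3.156e7     # year [s]


def sound_speed(temperature_k):
    """Sound speed from v_s^2 = k T / m_p (mean molecular weight of 1 assumed)."""
    return math.sqrt(K_B * temperature_k / M_P)


def cloud_scalings(mass_g, temperature_k):
    """Return (R, rho, t_dyn) from the order-of-magnitude scaling relations."""
    v_s = sound_speed(temperature_k)
    radius = G * mass_g / v_s**2             # R = G M / v_s^2
    density = v_s**6 / (mass_g**2 * G**3)    # rho = M / R^3
    t_dyn = radius / v_s                     # from R = v_s * t_dyn
    return radius, density, t_dyn


if __name__ == "__main__":
    radius, density, t_dyn = cloud_scalings(M_SUN, 10.0)
    print(f"R     = {radius / PC:.3f} pc")
    print(f"rho   = {density:.2e} g/cm^3")
    print(f"t_dyn = {t_dyn / YEAR:.2e} yr")
```

For a one solar mass cloud at 10 K this gives a radius of a few hundredths of a parsec, which is the right order of magnitude for a dense molecular cloud core.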

As the sphere collapses, let me suppose that it cools freely. The cloud stays approx- 
imately isothermal. A cold cloud is subject to Jeans gravitational instability, hence it 
fragments. As the density increases, the Jeans mass decreases. This means that the frag- 
ments subfragment into ever smaller masses. The process stops only when a fragment 
is dense enough to become opaque. It can no longer radiate freely. It is said to be opti- 
cally thick to radiation. The temperature rises and the Jeans scale no longer decreases. 
Fragmentation stops. 
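
The statement that the Jeans mass falls as an isothermal cloud gets denser can be checked numerically. The sketch below assumes one common convention for the Jeans mass, $M_J = (\pi^{5/2}/6)\, c_s^3\, G^{-3/2} \rho^{-1/2}$, which is not written out in the text; other conventions differ by factors of order unity. The function name and the sample densities are hypothetical.

```python
import math

# Rounded CGS constants.
G = 6.674e-8       # gravitational constant [cm^3 g^-1 s^-2]
K_B = 1.381e-16    # Boltzmann constant [erg K^-1]
M_P = 1.673e-24    # proton mass [g]
M_SUN = 1.989e33   # solar mass [g]


def jeans_mass(temperature_k, density, mu=1.0):
    """Jeans mass in grams, using M_J = (pi^2.5 / 6) c_s^3 / (G^1.5 rho^0.5).

    mu is the mean molecular weight (1 by default, an assumption).
    """
    c_s = math.sqrt(K_B * temperature_k / (mu * M_P))
    return (math.pi**2.5 / 6.0) * c_s**3 / (G**1.5 * math.sqrt(density))


if __name__ == "__main__":
    # Follow an isothermal (T = 10 K) cloud to higher and higher density.
    for exponent in range(-20, -12, 2):
        rho = 10.0**exponent
        m_j = jeans_mass(10.0, rho) / M_SUN
        print(f"rho = {rho:.0e} g/cm^3  ->  M_J = {m_j:.3g} M_sun")
```

Each factor of 100 in density lowers the Jeans mass by a factor of 10 at fixed temperature, which is the hierarchy of ever smaller fragments described above.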

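The luminosity of an opaque fragment satisfies


$$L_{\rm rad} = \sigma\,4\pi T^4 R^2 = \sigma\,4\pi T^4 G^2 M^2 v_s^{-4},$$

where $\sigma$ now denotes the Stefan–Boltzmann constant rather than the velocity dispersion.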


The fragments now accrete surrounding diffuse matter and grow in mass. The increasing 
gravity field of a fragment means that it slowly contracts in order to support its growing 
mass. We call the central region the protostellar core, since it is destined to become hot 
enough to be the core of a star. 

One can now derive the mass limit from opacity-limited fragmentation. The accretion 
rate of the surrounding envelope onto a protostellar core controls the rate of gravitational 
energy release. The gravitational energy release from collapse is 


$$L_g = GM^2 R^{-1} t_{\rm dyn}^{-1} = GM^2 R^{-2} v_s = v_s^5 G^{-1}.$$




[Figure 8.1 plot: vertical axis, Jeans mass; horizontal axis, density; curve labelled $\sim T^{3/2}\rho^{-1/2}$.]

Figure 8.1 Sketch of Jeans mass versus density, for a series of isothermal clouds of increasing
density. The minimal Jeans mass sets the opacity limit of fragmentation. This is generically found to be
about $0.01\,M_\odot$.


Hence, following the derivation by Rees (1976), one may equate the luminosity $L_{\rm rad}$ with
the rate of release of gravitational energy $L_g$, so that $\sigma\,4\pi T^4 G^2 M^2 v_s^{-4} = v_s^5 G^{-1}$. This leads
to an expression for the typical mass of a fragment,


$$M = v_s^{9/2}\, T^{-2}\, G^{-3/2}\, (\sigma 4\pi)^{-1/2} = {\rm const.}\; T^{1/4}.$$
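
The fragment mass can be evaluated with a few lines of code. This is a rough sketch under stated assumptions: $v_s^2 = kT/m_p$ with a mean molecular weight of 1, rounded CGS constants, a blackbody fragment, and no order-unity corrections. The function name is hypothetical.

```python
import math

# Rounded CGS constants.
G = 6.674e-8         # gravitational constant [cm^3 g^-1 s^-2]
K_B = 1.381e-16      # Boltzmann constant [erg K^-1]
M_P = 1.673e-24      # proton mass [g]
SIGMA_SB = 5.670e-5  # Stefan-Boltzmann constant [erg cm^-2 s^-1 K^-4]
M_SUN = 1.989e33     # solar mass [g]


def opacity_limited_mass(temperature_k):
    """M = v_s^(9/2) T^-2 G^(-3/2) (4 pi sigma)^(-1/2), in grams."""
    v_s = math.sqrt(K_B * temperature_k / M_P)
    return (v_s**4.5 * temperature_k**-2 * G**-1.5
            * (4.0 * math.pi * SIGMA_SB)**-0.5)


if __name__ == "__main__":
    for temperature in (10.0, 1000.0):
        mass = opacity_limited_mass(temperature) / M_SUN
        print(f"T = {temperature:6.0f} K  ->  M = {mass:.2e} M_sun")
```

With these simplifications the estimate comes out at a few thousandths of a solar mass and changes by only about a factor of three between 10 K and 1000 K, reflecting the weak $T^{1/4}$ dependence. That is within roughly an order of magnitude of the value of about $0.01\,M_\odot$ obtained from detailed calculations, which is as much as a scaling argument of this kind can be expected to deliver.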


Our goal is to express this result in terms of fundamental physics parameters. To
convert to fundamental units as first done in Low and Lynden-Bell (1976), I use
$\sigma = (2\pi^5/15)\,k^4 h^{-3} c^{-2}$ and write $T$ dimensionlessly as $kT/m_pc^2$ to obtain, up to a
numerical factor of order unity, $M/m_p = \alpha_g^{-3/2}\alpha^{-3/2}(kT/m_pc^2)^{1/4}$. Also, I write $T$ in units of
Rydbergs, $1\,{\rm Ry} = \alpha^2 m_e c^2/2$, and obtain $M/m_p = \alpha_g^{-3/2}\alpha^{-1}(m_e/m_p)^{1/4}(kT/1\,{\rm Ry})^{1/4}$.
Note that I have introduced the gravitational fine-structure constant
$\alpha_g = Gm_p^2/e^2 \approx 8.1 \times 10^{-37}$. This constant represents the
ratio between gravitational and electromagnetic forces between a pair of protons. It is
fundamental to understanding what forces determine the mass of a star. Any fundamental
stellar mass, including the Chandrasekhar mass or the Eddington mass, can be expressed
in terms of $\alpha_g^{-3/2}$ in combination with the fine-structure constant $\alpha = e^2/\hbar c$.
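
As a consistency check, the fundamental-units form can be compared numerically with the direct expression for the fragment mass. In the sketch below the numerical prefactor $\sqrt{15/\pi^3}$ is not taken from the text: it is what follows from inserting $\sigma = (2\pi^5/15)k^4h^{-3}c^{-2}$, $\alpha = e^2/\hbar c$ and $\alpha_g = Gm_p^2/e^2$ into $M = v_s^{9/2}T^{-2}G^{-3/2}(4\pi\sigma)^{-1/2}$ with $v_s^2 = kT/m_p$. Constants are rounded Gaussian CGS values and the function names are hypothetical.

```python
import math

# Rounded Gaussian CGS constants.
G = 6.674e-8       # gravitational constant [cm^3 g^-1 s^-2]
K_B = 1.381e-16    # Boltzmann constant [erg K^-1]
M_P = 1.673e-24    # proton mass [g]
M_E = 9.109e-28    # electron mass [g]
C = 2.998e10       # speed of light [cm s^-1]
H = 6.626e-27      # Planck constant [erg s]
E = 4.803e-10      # elementary charge [esu]

HBAR = H / (2.0 * math.pi)
ALPHA = E**2 / (HBAR * C)            # fine-structure constant
ALPHA_G = G * M_P**2 / E**2          # gravitational analogue used in the text
SIGMA = (2.0 * math.pi**5 / 15.0) * K_B**4 / (H**3 * C**2)  # Stefan-Boltzmann
RYDBERG = ALPHA**2 * M_E * C**2 / 2.0                       # 1 Ry [erg]
PREFACTOR = math.sqrt(15.0 / math.pi**3)                    # derived, see lead-in


def mass_direct(temperature_k):
    """M = v_s^(9/2) T^-2 G^(-3/2) (4 pi sigma)^(-1/2)."""
    v_s = math.sqrt(K_B * temperature_k / M_P)
    return v_s**4.5 * temperature_k**-2 * G**-1.5 * (4.0 * math.pi * SIGMA)**-0.5


def mass_fundamental(temperature_k):
    """Same mass written with alpha, alpha_g and kT / (m_p c^2)."""
    x = K_B * temperature_k / (M_P * C**2)
    return PREFACTOR * M_P * (ALPHA * ALPHA_G)**-1.5 * x**0.25


def mass_rydberg(temperature_k):
    """Same mass with the temperature measured in Rydbergs."""
    t_ryd = K_B * temperature_k / RYDBERG
    return (PREFACTOR * 2.0**-0.25 * M_P * ALPHA_G**-1.5 / ALPHA
            * (M_E / M_P)**0.25 * t_ryd**0.25)


if __name__ == "__main__":
    print(f"alpha   = 1/{1.0 / ALPHA:.1f}")
    print(f"alpha_g = {ALPHA_G:.2e}")
    for temperature in (10.0, 1000.0):
        print(temperature, mass_direct(temperature),
              mass_fundamental(temperature), mass_rydberg(temperature))
```

The three functions agree to floating-point accuracy, which confirms that the exponents of $\alpha$, $\alpha_g$ and $m_e/m_p$ in the two dimensionless forms are mutually consistent.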

Attainable temperatures are 10 K in nearby molecular cloud cores and 1000 K in primordial
clouds where there are only trace amounts of H$_2$ as coolants. The sound speed and
accretion rate vary considerably over this temperature range. However, one always finds
that the minimum mass limited by opacity is $M \sim 0.01\,M_\odot$. This is true for primordial star
formation with cooling via H$_2$ (Palla et al., 1983) as well as via dust (Omukai et al., 2005).
In other words, the minimum mass of a fragment is limited to be around $0.01\,M_\odot$, or well
below the mass of even the smallest star.

In the primordial case, the parent clouds are typically in the mass range 10° — 107Mo 
and have trace amounts of molecular hydrogen. Radiative trapping raises the minimum 
fragment mass in the case of more massive primordial clouds that are purely atomic and
cool by Lyman alpha emission (Latif et al., 2011). But it still stays small and remarkably
close to the opacity-limited fragmentation value. 

Fundamental theory applied to a diffuse interstellar cloud that is collapsing under self- 
gravity yields a minimum and small fragment mass. This is a robust but wrong result in 
terms of star formation! Such a result at first sight seems irrelevant to stars. In fact, it 
provides the building blocks of stars. The derived scale is central to star formation theory, 
and vindicated by numerical simulations. The point is that additional physics is needed, as 
we see below, to build on these fragments. The missing physics is that of fragment growth 
by accretion. 


8.3 The First Stars


To build up the stellar mass function, from fragments to stars, the crucial ingredient is 
supplementing fragmentation by continuing accretion of cold gas. To avoid making only 
massive stars, however, one has to limit accretion. This is achieved by feedback that taps 
stellar energy via magnetic turbulence and outflows for low mass stars and ionisation fronts 
(HII regions) and winds for massive stars. The accretion rate $v_s^3/G \propto T^{3/2}/G$ for a temperature
of around 10 K today for nearby star-forming cold clouds is $\sim 10^{-6}\,M_\odot$/year.
However, long ago, in the absence of heavy elements, the sound speed was high, up to
$\sim 10$ km/s at $T \sim 10^4$ K, and the accretion rate exceeded $\sim 10^{-3}\,M_\odot$/year. This means that
the first stars were relatively massive. In fact there is feedback from the forming massive 
stars, due to ionisation fronts and outflows. The net result is that there is a wide range of 
stellar masses with a relatively flat mass function, for the first generation of stars (Hirano 
et al., 2015). 
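
The contrast between present-day and primordial accretion rates follows directly from the $v_s^3/G$ scaling. The sketch below is an order-of-magnitude illustration that assumes $v_s^2 = kT/m_p$ with a mean molecular weight of 1 and rounded CGS constants; the function name is hypothetical, and real rates depend on the detailed structure of the collapsing cloud.

```python
import math

# Rounded CGS constants.
G = 6.674e-8       # gravitational constant [cm^3 g^-1 s^-2]
K_B = 1.381e-16    # Boltzmann constant [erg K^-1]
M_P = 1.673e-24    # proton mass [g]
M_SUN = 1.989e33   # solar mass [g]
YEAR = 3.156e7     # year [s]


def accretion_rate(temperature_k, mu=1.0):
    """Characteristic accretion rate v_s^3 / G in solar masses per year.

    mu is the mean molecular weight (1 by default, an assumption).
    """
    v_s = math.sqrt(K_B * temperature_k / (mu * M_P))
    return v_s**3 / G * YEAR / M_SUN


if __name__ == "__main__":
    for temperature in (10.0, 1.0e3, 1.0e4):
        rate = accretion_rate(temperature)
        print(f"T = {temperature:7.0f} K  ->  v_s^3/G = {rate:.1e} M_sun/yr")
```

Because the rate scales as $T^{3/2}$, raising the temperature from 10 K to $10^4$ K boosts it by a factor of about 30,000, which is why hot metal-free gas favours massive stars.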

If the central accretion rate is sufficiently high, there is little time for the infalling gas 
to fragment into stars before a central star forms that rapidly undergoes direct collapse 
into a black hole (Inayoshi and Haiman, 2014). There is no time in this case for radiative 
feedback from the central object to halt the collapse. 


8.4 The Typical Mass of a Star 


However, Jeans did not address the question posed by Newton as to what distinguishes
planets from stars. The physics that differentiates stars from planets was first laid down 
by Arthur Eddington. He was the first to understand Newton’s dilemma, as to how gravi- 
tational instability and fragmentation could distinguish opaque objects that we now know 
shine by reflected sunlight (planets) from intrinsically luminous objects (stars). Eddington 
wrote 


We can imagine a physicist on a cloud-bound planet who has never heard tell of the stars. (She) 
calculates the ratio of radiation pressure to gas pressure for a series of globes of gas of various sizes, 
starting, say, with a globe of mass 10 gm, then $10^2$ gm, $10^3$ gm, and so on, so that (her) nth globe
contains $10^n$ gm. Regarded as a tussle between gas pressure and radiation pressure, the contest is
overwhelmingly one sided except between Nos 33 to 35, where we may expect something interesting
to happen. What happens is the stars. We draw aside the veil of clouds beneath which our physicist
has been working and let (her) look up at the sky. There she will find a thousand million globes of
gas nearly all between 1/2 and 50 times the sun’s mass.


(Sir Arthur Eddington, The Internal Constitution of the Stars, 1926). 
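

The physicist's calculation can be mimicked in a few lines. The sketch below is an illustration that goes beyond the text: it assumes Eddington's standard model (an $n = 3$ polytrope), for which the gas-pressure fraction $\beta$ satisfies the quartic $1 - \beta = K \mu^4 (M/M_\odot)^2 \beta^4$ with $K \approx 0.003$, and it adopts a mean molecular weight $\mu = 0.6$. Both the value of $\mu$ and the function names are assumptions made here; Eddington's own numbers used a different composition.

```python
# Assumed inputs (not from the text).
M_SUN = 1.989e33       # solar mass [g]
K_EDDINGTON = 3.0e-3   # coefficient of Eddington's quartic, approximately
MU = 0.6               # mean molecular weight, assumed


def quartic_coefficient(mass_g, mu=MU):
    """k in 1 - beta = k * beta^4 for a globe of the given mass."""
    return K_EDDINGTON * mu**4 * (mass_g / M_SUN)**2


def gas_pressure_fraction(mass_g, mu=MU):
    """Solve 1 - beta = k beta^4 by bisection.

    Returns (beta, 1 - beta): the gas and radiation pressure fractions.
    The radiation fraction is evaluated as k * beta^4 to stay accurate
    when it is tiny.
    """
    k = quartic_coefficient(mass_g, mu)
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # f(beta) = 1 - beta - k beta^4 decreases with beta; root lies above
        # mid while f(mid) is still positive.
        if 1.0 - mid - k * mid**4 > 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    return beta, k * beta**4


if __name__ == "__main__":
    for n in range(30, 39):
        beta, radiation = gas_pressure_fraction(10.0**n)
        print(f"globe No. {n}: radiation pressure fraction = {radiation:.3g}")
```

Radiation pressure is negligible for the low-numbered globes and dominant for the high-numbered ones, with the changeover confined to a few decades in mass around the stellar range. The exact globe numbers at which it happens shift with the assumed $\mu$.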


The step from planets to stars involves the force of gravity requiring enough heat under 
the enormous central pressure that the object shines. Indeed the critical step is that it main- 
tains a source of central heat by thermonuclear reactions, otherwise stars would be too 
short-lived. Remarkably, when Eddington deduced the main sequence range of stars he did 
not know about the burning of hydrogen to helium as the central energy supply for sun-like 
stars. This only came decades later with the work of nuclear physicist Hans Bethe. 

To understand Eddington’s conclusions better, one can cast the minimum mass of a star 
in terms of fundamental units. I review and reassess the arguments of Krumholz (2011), 
in which he re-evaluates the gravitational fragmentation mass for a protostar by using 
accretion to set the protostellar luminosity, and the onset of deuterium burning to set the 
central temperature. He argues that approximately half of the Bonnor–Ebert mass forms
the protostar, namely

$$M_* \approx 0.6\left(\frac{kT_e}{\mu m_p G}\right)^{3/2}\rho_e^{-1/2}.$$


Here $T_e$ is the gas temperature at the edge of the gas accreting to form the star, and $\rho_e$ is
the local gas density, averaged over the collapsing region. He fixes $T_e$ from the protostellar
luminosity assuming dust opacity dominates, and obtains the luminosity from the energy
released by accretion onto the central protostar. The scalings are $L \propto T_c \dot{M}_* \propto M T_c \rho_e^{1/2}$,
and the central temperature $T_c$ is fixed by deuterium burning to be constant (in terms of
fundamental constants, it is proportional to the Gamow energy, $E_G$, for the D burning
reaction, itself $\propto \alpha^2 m_p c^2$). The temperature $T_e$ is determined by the central luminosity
and, for the case of fiducial Milky Way-type dust cooling in an $n = 3/2$ polytrope, scales as
$(\Sigma L/M)^{1/4}$. The interstellar pressure enters via the assumption that surface density (more
physically, dust opacity) is constant, and consequently pism « 7. 
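
The starting point of this argument, roughly half a Bonnor–Ebert mass, is easy to evaluate. The sketch below assumes the standard Bonnor–Ebert mass $M_{\rm BE} = 1.18\, c_s^3 / (G^{3/2}\rho^{1/2})$ with $c_s^2 = kT/\mu m_p$, so that half of it is approximately $0.6\,(kT_e/\mu m_p G)^{3/2}\rho_e^{-1/2}$. The edge temperature, edge density and mean molecular weight used in the example are illustrative choices, not values from the text, and the function name is hypothetical.

```python
import math

# Rounded CGS constants.
G = 6.674e-8       # gravitational constant [cm^3 g^-1 s^-2]
K_B = 1.381e-16    # Boltzmann constant [erg K^-1]
M_P = 1.673e-24    # proton mass [g]
M_SUN = 1.989e33   # solar mass [g]


def half_bonnor_ebert_mass(t_edge_k, rho_edge, mu=2.33):
    """About half the Bonnor-Ebert mass, in grams.

    t_edge_k : gas temperature at the edge of the collapsing region [K]
    rho_edge : gas density there [g cm^-3]
    mu       : mean molecular weight (2.33, molecular gas, is an assumption)
    """
    return 0.6 * (K_B * t_edge_k / (mu * M_P * G))**1.5 / math.sqrt(rho_edge)


if __name__ == "__main__":
    # Illustrative cold-core conditions: 10 K and 1e-19 g/cm^3.
    mass = half_bonnor_ebert_mass(10.0, 1.0e-19)
    print(f"M_* ~ {mass / M_SUN:.2f} M_sun")
```

For cold, dense core conditions such as these the result is a sizeable fraction of a solar mass, and it scales as $T_e^{3/2}\rho_e^{-1/2}$, which is why the remainder of the argument concentrates on what fixes $T_e$.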

The resultant scaling for stellar mass is M, « ag Pq 4*3p-VB@ TA? where ©, = 
$(E_G/4kT_c)^{1/3} \sim 10$ and $T_c \approx 10^6$ K at the onset of the deuterium burning that determines
the onset of the formation of the star. Krumholz in fact normalises the interstellar pressure
to the Planck pressure $c^7/\hbar G^2$ and thereby introduces what is a misleading dependence
of the characteristic stellar mass on the gravitational fine structure parameter, $Gm_p^2/e^2$,

obtaining 
1/18 
wie\ p \7H8 
M, = Axmy 35 > 
AG PPlanck 


where $A_K$ is a dimensionless constant.


In fact, it is more appropriate to normalise the temperature to natural atomic units,
namely to a Rydberg, $kT_{\rm Ryd} = \alpha^2 m_e c^2/2$. After all, quantum gravity has little relevance
for star formation. Only atomic or molecular processes, specifically Lyman alpha emission
or molecular H$_2$ rotational excitations, are effective for primordial star formation, with
$T \sim 1000$ K, and of course in the present epoch interstellar medium, star formation occurs
at $T \sim 10$ K. The pressure at the boundary of the protostar can be written to within
dimensionless factors of order unity as $p = \rho kT/m_p = v_s^8/G^3M^2$. For purposes of normalisation,
I use a fiducial value $v_s = \alpha c$, equivalent to the Rydberg energy $kT_{\rm Ryd}$ of an electron. Now
rewriting the expression for the characteristic mass, I find that


4/9 [8&4 
kT com 
M, = Asmya,?!a~*° SS ; e 

MyC e°p 


where $A_S$ is a dimensionless factor of order unity. After inserting the previous expression
for the pressure in terms of $v_s$, this reduces to


1/4 7 ep. \ 12 pp 1/4 
= 3/22 { ™p c Ryd 
My, = Asmpy0, a (“2) (5) (= 


Krumholz’s conclusion, as corrected here, is essentially unchanged: the only dependence
of $M_*$ on astrophysical parameters is via the interstellar pressure, and the dependence is
exceedingly weak. There is still an explicit dependence on the deuterium burning temper- 
ature. However, there is no longer any scaling with ambient pressure, rather with ambient 
temperature. Note that the temperature scaling is the inverse of that found for opacity- 
limited fragmentation. The scaling in fundamental units gives a mass scale of around a 
solar mass. This is the typical mass of a star in today’s universe. 

In the early universe, conditions were very different. There were no heavy elements, 
since the stars that produce and disperse them had not yet formed. This meant that in the 
absence of the cooling transitions from low-lying fine-structure states of atoms and ions 
such as carbon and silicon, the only coolant was hydrogen. Hence the temperature was 
relatively high, thousands of degrees Kelvin rather than tens of degrees Kelvin. 

Since the sound speed was high, accretion rates were high. Massive stars predominated. 
Trace amounts of molecular hydrogen allowed temperatures of order 1000K. This was 
optimal for massive stars to form. However if the molecular hydrogen, a relatively frag- 
ile component of the gas, was destroyed, cooling by atomic hydrogen only occurred at a 
temperature of order 10,000 K. Indeed if the temperature was this high, as expected from
atomic hydrogen cooling, the enhanced accretion rate, $\sim v_s^3/G$, led to the optimal situation
for black hole formation. In such a situation, ultraviolet radiation, produced by the first 
massive stars, maintains the suppression of any trace amounts of molecular hydrogen and 
cooling to lower temperatures cannot occur. One needs some cooling of course, in order 
for the gas to collapse to high density under the action of self-gravity, and this is supplied 
by Lyman alpha emission. The resulting huge accretion rate quenches any fragmentation, 
and the outcome is likely to be direct collapse to a black hole, and possibly a massive black 
hole. 
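
As a rough numerical illustration of why hotter gas favours massive objects, the sketch below evaluates the characteristic accretion rate $v_s^3/G$ for an isothermal sound speed $v_s = (kT/\mu m_p)^{1/2}$. The mean molecular weights (2.3 for present-day molecular gas, 1.22 for neutral primordial gas) and the function names are assumptions made for this example and are not taken from the text.

```python
import math

K_B = 1.380649e-23     # Boltzmann constant, J/K
M_P = 1.67262192e-27   # proton mass, kg
G = 6.67430e-11        # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30       # solar mass, kg
YEAR = 3.156e7         # seconds per year


def sound_speed(temperature_k, mu):
    """Isothermal sound speed in m/s for mean molecular weight mu (assumed)."""
    return math.sqrt(K_B * temperature_k / (mu * M_P))


def accretion_rate_msun_per_yr(temperature_k, mu):
    """Characteristic accretion rate v_s^3 / G in solar masses per year."""
    v_s = sound_speed(temperature_k, mu)
    return v_s**3 / G * YEAR / M_SUN


# Illustrative cases only; the mu values are assumptions for this sketch.
cases = [
    ("present-day molecular cloud", 10.0, 2.3),
    ("H2-cooled primordial gas", 1.0e3, 1.22),
    ("atomic-cooled primordial gas", 1.0e4, 1.22),
]
for label, temperature, mu in cases:
    rate = accretion_rate_msun_per_yr(temperature, mu)
    print(f"{label}: T = {temperature:g} K, rate ~ {rate:.2e} Msun/yr")
```

Because the rate scales as $T^{3/2}$, raising the temperature from 10 K to 10,000 K increases it by several orders of magnitude, which is the trend the argument above relies on.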

While such objects formed long ago, there are various ways one can probe their existence.
For example the short-lived massive stars of the now extinct Population III leave
nucleosynthetic tracers in the atmospheres of the oldest low mass members of Population 
II. These stars can be seen in the halo by virtue of their low metallicities, rare stars being 
found with an iron content lower than that of the Sun by $10^5$ or more. Another characteristic
is their nebular emission: intense lines of He II are predicted with no accompanying
emission by heavier ions such as [OII]. Tentative evidence for such objects has recently
been reported (Heap et al., 2015; Sobral et al., 2015). The first black holes could provide 
the required seeds for the growth of supermassive black holes seen as quasars in the early 
universe. The need for such seeds is inferred from the high black hole masses, in excess
of $\sim 10^9 M_\odot$, measured at very high redshift when there is inadequate time for
Eddington-limited growth. An alternative solution may appeal to a duty cycle with many intense but
short-lived periods of super-Eddington growth.

Population III stars and black hole seeds may also contribute to the ionising photon bud- 
get responsible for reionisation of the universe. While dwarf galaxies are often considered 
to be the dominant contributor (Robertson et al., 2015), considerable doubt exists as to 
whether the escape fraction of ionising photons is high enough, according to simulations 
that include gas infall via cosmological accretion (Ma et al., 2015). Nor is it likely from 
diffuse X-ray background constraints that black hole seeds, acting as mini-quasars, can 
account for more than ~ 10 per cent of the ionising photon budget (Haardt and Salvaterra, 
2015). 

Perhaps the ultimate solution is the combination of mini-quasars, essentially generated 
by accretion onto black hole seeds of mass 10° — 10°Mo, that drive cavities in the accreting 
gas, combined with bursts of star formation from the associated dwarf galaxies. 

Indeed there is increasing evidence that dwarfs contain central massive black holes. The 
AGN fraction may be as large as ~ 10 per cent for the lowest mass nearby infrared-detected 
dwarfs (Satyapal et al., 2014) and ~ 1 per cent if optically selected (Moran et al., 2014) 
with active AGN found for black holes in the mass range 10*—10°Mo, infrared excesses 
selecting the lowest mass MBH. X-ray selected dwarfs are rarer, with central massive BH 
candidates amounting to ~ 0.1—1 per cent (Lemons et al., 2015; Pardo et al., 2016). 


8.5 Modelling the Galaxy Luminosity Function 


The initial condition problem plagued all early studies of galaxy formation: pioneers
included Lemaître, Gamow, Harrison, Peebles, Zel’dovich, Ozernoy and others. This was
first solved schematically with COBE in 1992, in the sense that ultra-large-scale
fluctuations were detected that had no direct match in observed features in the galaxy
distribution, but were nevertheless a compelling indicator of the presence of the nearly
scale-invariant fluctuations needed for large-scale structure formation. The definitive
all-sky detection of the elusive fossil seeds awaited the WMAP/Planck satellites some two
decades later, when the precursors of galaxy clusters were directly mapped as infinitesimal
cosmic microwave background radiation fluctuations.

The linear theory of density fluctuation growth in the expanding universe is well justified
by virtue of the measured temperature fluctuations in the CMB. These provide the elusive 
initial conditions for structure formation, with only a modest extrapolation required from 
cluster down to galaxy scales. Scale invariance and Gaussianity are usually invoked to 
make this last step unique. These are effectively measured in the CMB fluctuations over 
a broad range of scales. The fully non-linear endpoint is the mass function of galaxies, 
calculated as the assembly of newly non-linear objects in the rare peaks of the Gaussian tail 
that move above a certain threshold as the fluctuations grow in amplitude by gravitational 
instability. 

It is notoriously difficult to transform the mass function of galaxies, understood from the- 
ory, into the observed luminosity or stellar mass function. This is because of the complexity 
of baryonic physics. The subdominant baryons, initially tracing the cold dark matter, dissi- 
pate energy via atomic cooling and cool to form dense self-gravitating clouds that fragment 
into stars. However, neither the efficiency of star formation nor the initial mass function 
of the stars nor the rate of star formation can be predicted in any fundamental way. So 
we cannot easily predict the stellar mass function of galaxies if we fail at the first hurdle, 
namely that of forming the stars. 

What we do understand is the mass function of dark matter halos. Moreover while the 
ratio of dark matter to gravitationally bound baryons is known, the mass fraction in stars 
is unknown, a priori. So one simple starting point is to renormalise the dark matter mass 
function to fit the observed galaxy luminosity function. We can then calculate the ratio of 
dark to luminous matter required for the two functions to overlap. In fact they overlap at 
only one point. Fortunately this point is well defined, as the mean galaxy luminosity and 
the mean dark mass, both mass and luminosity functions being fully convergent. What we 
infer is $M_*/L_*$, where $L_*$ is the Schechter luminosity of about $3 \times 10^{10} L_\odot$, and $M_*$ is the
required dark mass corresponding to the halo of a galaxy of luminosity $L_*$ in order for the
mass and luminosity functions to overlap at their mean values. We can measure this ratio
observationally: it is approximately 30 in solar units $M_\odot$ and $L_\odot$.
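
The convergence of the luminosity function can be illustrated with the standard Schechter form, $\phi(L)\,dL = \phi_* (L/L_*)^{\alpha} e^{-L/L_*}\, dL/L_*$, whose integrated luminosity density is $\phi_* L_* \Gamma(\alpha + 2)$ for $\alpha > -2$. The sketch below compares that closed form with a direct numerical integral. The faint-end slope $\alpha = -1.2$, the unit normalisations and the function names are assumptions chosen for illustration only.

```python
import math


def schechter_luminosity_density(alpha, l_star=1.0, phi_star=1.0):
    """Closed form: integral of L * phi(L) dL = phi_* L_* Gamma(alpha + 2)."""
    return phi_star * l_star * math.gamma(alpha + 2.0)


def numeric_luminosity_density(alpha, x_min=1e-8, x_max=50.0, n=20000):
    """Trapezoid rule in ln(x), x = L/L_*, with phi_* = L_* = 1.

    The integrand x * x**alpha * exp(-x) dx becomes
    x**(alpha + 2) * exp(-x) d(ln x).
    """
    ln_min, ln_max = math.log(x_min), math.log(x_max)
    h = (ln_max - ln_min) / n
    total = 0.0
    for i in range(n + 1):
        x = math.exp(ln_min + i * h)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * x ** (alpha + 2.0) * math.exp(-x) * h
    return total


alpha = -1.2  # assumed faint-end slope, for illustration only
print(schechter_luminosity_density(alpha), numeric_luminosity_density(alpha))
```

Both the faint and the bright ends contribute little to the integral, so the mean luminosity is a well-defined quantity at which the two functions can be matched.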

Remarkably, we can calculate M,. from first principles. A necessary condition for star 
formation in a protogalaxy is gas cooling. This does not guarantee star formation unless 
self-gravity also plays a role. It is instructive to compare the two key timescales, the free- 
fall time with the cooling time for an isothermal sphere. In this way one can estimate the 
maximum mass for a galaxy, or more realistically a characteristic mass (Binney, 1977; 
Rees and Ostriker, 1977; Silk, 1977). 

In order to describe how the precursor cloud destined to form a galaxy can contract under 
gravity I need to introduce the atomic cooling function for a primordial plasma of hydrogen 
and helium, written as $\Lambda_0 = \alpha \sigma_T c\, \mu_e^{-1} E_0 (v_c/v_{\rm ref})^{2\beta}$, where $v_c$ is the circular velocity
of a test particle in the galaxy at the half-mass radius, $v_{\rm ref} = \alpha c\, (m_e/m_p)^{1/2}$ and $E_0 =
\alpha^2 c^2 m_e$. Here $\beta$ is a function of temperature, where $\beta = 0.5$ describes bremsstrahlung
(i.e. braking radiation) and $\beta = -0.5$ approximates bound-free cooling of hydrogen. This
is a very simplified formulation and more accurate cooling rates are given by Sutherland
Figure 8.2 Comparison of observed galaxy luminosity function and predicted galaxy mass function
in a cold dark matter-dominated universe with initial conditions set by generic inflationary cosmol- 
ogy. The mass function has been renormalised to match the luminosity function at the mean mass 
(and luminosity) of a galaxy. The inferred ratio of cooled baryonic (stellar) mass to luminosity is 
approximately 30, as expected for an old stellar population. 


and Dopita (1993), with an update including on-line tables by Gnat and Sternberg (2007). 
In fact, for a hydrogen-helium plasma at 10° — 10’K, the relevant temperature for typical 
galaxy halos, we can bypass these details and use $\beta \approx -0.5$ as a reasonable approximation.
I set $t_c = 3kT/(2\Lambda_0 n)$ and $t_d = GM_d/(2v_s^3)$. Here $M_d$ is the dynamical mass. With
a slightly more general parametrisation of the cooling function, the characteristic galaxy 


mass is now 
a (Mp i Us\2 te 
Mz = 2 ( ) ( ) 
a Me Cc tg 
Finally let us require the cooling time to be less than the dynamical time as a necessary 
(if not sufficient) condition for galaxy formation. The magic mass, the typical mass of 
a galaxy, is $\sim 3 \times 10^{11} M_\odot$, and is obtained from simple fundamental physics reasoning,
namely that a necessary condition to form a galaxy is that a protogalactic cloud cool within 
a gravitational collapse time. Of course this may not be a sufficient condition, but exploring 
this caveat would take us into the complex and messy area of baryonic feedback. For an 
old stellar population, this stellar mass gives the mean galaxy luminosity, or the Schechter 
luminosity, in visible light of $\sim 3 \times 10^{10} L_\odot$. This is true for H plus He cooling, but remains
more or less the case even when metal cooling is included. 
In fact, gas accretion continues over a Hubble time unless feedback and/or environmental
effects intervene. To explore this regime, I replace the galaxy dynamical time
in the previous derivation by the age of the universe, i.e. set the gas cooling time equal
to the Hubble time. This provides the maximum mass of a galaxy. The additional factor is
ty/ta = (122 Gp/H5)'/? = (3 pg /p(z))'/? x 25. This would result in an excess of the 
most massive galaxies. Feedback must intervene, and this is generally accepted to be via 
quasar (supermassive black hole outflow)-driven heating of the intergalactic medium. 


8.6 Massive Black Holes and Quasars 


Quasars are powered by accretion onto supermassive black holes and are detected about 
as far back in time as we can observe galaxies, that is to a redshift of 7 or 8. In fact, the 
first galaxies are in place somewhat earlier, but growth both for galaxies and quasars peaks 
much later, at z ~ 2. We measure the growth of the stellar component of galaxies by the star 
formation rate, and the accretion rate onto black holes by the X-ray luminosity of quasars. 

We observe quasars powered by black holes of mass up to $\sim 10^{10} M_\odot$ at high redshift,
but we do not understand how such massive black holes grow so rapidly in the early uni- 
verse. The presence of even a few billion solar mass MBHs at high redshift challenges 
conventional Eddington-limited growth models for MBHs. 

Assuming Eddington-limited accretion, a black hole mass grows as 


$$M = M_0 \exp\{[(1 - \epsilon)/\epsilon]\,(t/t_S)\},$$


where the Salpeter timescale $t_S = \sigma_T c/(4\pi G m_p) = 0.45$ Gyr, and $\epsilon$ is the radiative
efficiency, normally related to black hole spin in radiatively efficient accretion disks, ranging
from $\epsilon = 0.057$ to $\epsilon = 0.32$ for spin parameters ranging from 0 to 0.998. Given the age
of the Universe at $z = 6$–$7$ and the estimated MBH masses, $> 10^9 M_\odot$, even with quasi-continuous
Eddington-limited accretion, we would require seeds of $10^2 < M_0 < 10^5 M_\odot$.
Such seeds are not easily found. 
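
The size of the required seeds can be estimated with a few lines of code. The sketch below applies the exponential growth law with $t_S = \sigma_T c/(4\pi G m_p) \approx 0.45$ Gyr and, as a deliberately generous upper bound, lets the hole accrete at the Eddington limit for the entire age of the universe at the observed redshift. The flat ΛCDM parameters ($H_0 = 67.7$ km/s/Mpc, $\Omega_m = 0.31$), the target mass and the function names are assumptions made for this example.

```python
import math

T_SALPETER_GYR = 0.45  # sigma_T c / (4 pi G m_p) in Gyr


def age_flat_lcdm_gyr(z, h0=67.7, omega_m=0.31):
    """Age of a flat LCDM universe at redshift z (radiation neglected)."""
    omega_l = 1.0 - omega_m
    hubble_time_gyr = 977.8 / h0  # 1/H0 in Gyr for H0 in km/s/Mpc
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return hubble_time_gyr * 2.0 / (3.0 * math.sqrt(omega_l)) * math.asinh(x)


def growth_factor(t_gyr, epsilon):
    """M / M_0 after Eddington-limited accretion for a time t_gyr."""
    return math.exp((1.0 - epsilon) / epsilon * t_gyr / T_SALPETER_GYR)


def required_seed_mass(final_mass_msun, t_gyr, epsilon):
    """Seed mass needed to reach final_mass_msun in the time available."""
    return final_mass_msun / growth_factor(t_gyr, epsilon)


t_available = age_flat_lcdm_gyr(7.0)
for epsilon in (0.057, 0.1, 0.32):
    seed = required_seed_mass(1.0e9, t_available, epsilon)
    print(f"epsilon = {epsilon}: seed ~ {seed:.2e} Msun in {t_available:.2f} Gyr")
```

The strong sensitivity to $\epsilon$ is the point: a rapidly spinning, radiatively efficient hole grows far more slowly at the Eddington limit, and any duty cycle below unity pushes the required seed mass higher still.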

The causal connection between massive black hole growth and associated quasar activity 
and galaxy growth by star formation is unknown, although co-evolution of massive black 
holes and galaxies certainly occurs as evidenced by the similar cosmic history of star for- 
mation rate and black hole accretion rate in Madau and Dickinson (2014). The evidence for 
co-evolution, and most likely self-regulation, comes from the interpretation of the empiri- 
cal correlation between black hole mass and spheroid mass, more specifically the velocity 
dispersion of spheroid stars which is effectively a proxy for the mass. 

To examine co-evolution, and in particular the competition between accretion-driven
black hole growth and black hole-powered outflows, I begin with the Eddington limit

$$L_{\rm Edd} = 4\pi c\, G M_{\rm BH} \frac{m_p}{\sigma_T},$$

and set the corresponding mechanical energy flux $\frac{1}{2}\dot M_{\rm Edd} v_w^2 = \eta L_{\rm Edd}/2$, where $\eta$ is the
radiative efficiency of the black hole, $L/(\dot M_{\rm BH} c^2)$. Note that $v_w = \eta c$, if $L = L_{\rm Edd}$, as
observed in quasar broad emission line regions for $\eta \sim 0.1$, as expected for canonical
values of black hole spin. Now balance the outward energy with that of the gravitational
energy flux of bound and accreting gas, $\sim 16 f_g \sigma^5/G$, in a marginally escaping shell, where
$\sigma$ is the gas velocity dispersion and the escape velocity for an isothermal distribution is
taken to be $2\sigma$.
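
For orientation, the sketch below evaluates the Eddington luminosity for ionised hydrogen, $4\pi G M m_p c/\sigma_T$, and checks numerically that equating $\frac{1}{2}\dot M v_w^2$ to $\eta L/2$ with $\dot M = L/(\eta c^2)$ gives a wind speed of $\eta c$. The black hole mass of $10^8 M_\odot$, the value $\eta = 0.1$ and the function names are illustrative assumptions.

```python
import math

G = 6.67430e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8         # speed of light, m/s
M_P = 1.67262192e-27     # proton mass, kg
SIGMA_T = 6.6524587e-29  # Thomson cross-section, m^2
M_SUN = 1.989e30         # solar mass, kg


def eddington_luminosity_watts(m_bh_msun):
    """Eddington luminosity for ionised hydrogen, in watts."""
    return 4.0 * math.pi * G * C * m_bh_msun * M_SUN * M_P / SIGMA_T


def wind_speed(luminosity_watts, eta):
    """Solve (1/2) Mdot v^2 = eta L / 2 with Mdot = L / (eta c^2)."""
    mdot = luminosity_watts / (eta * C**2)
    return math.sqrt(eta * luminosity_watts / mdot)


m_bh = 1.0e8  # assumed black hole mass in solar masses
eta = 0.1     # assumed radiative efficiency
l_edd = eddington_luminosity_watts(m_bh)
print(f"L_Edd ~ {l_edd:.3e} W ({l_edd * 1e7:.3e} erg/s)")
print(f"v_w ~ {wind_speed(l_edd, eta) / 1e3:.0f} km/s")
```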

If the black hole is massive enough that the Eddington energy release exceeds that of the
infalling gas, the gas reservoir is ejected, star formation is quenched, and the black hole
stops growing. In the case of energy feedback, as would apply for a quasar wind, I find that
(Silk and Rees, 1998)

$$M_{\rm BH} = a\, \frac{f_g \sigma^5 \sigma_T}{\pi \eta c\, G^2 m_p},$$

where $a \approx 8$. This would correspond to very strong feedback that, however, neglects energy
loss by cooling.

Let us consider momentum feedback, as would apply for a radiatively driven outflow.
The radiative momentum flux is $L/c$. This is mostly trapped if photons undergo single
scatterings. I compare this with the required momentum flux to eject the gas that has not
yet formed stars. The momentum flux of the gas-rich protospheroid, if in a shell at radius $r$
ejected at velocity $\sigma$, is $4 f_g \sigma^4/G$. Identify $L$ with the Eddington luminosity and one has

$$M_{\rm BH} = f_g\, \frac{\sigma^4 \sigma_T}{\pi G^2 m_p}.$$

Momentum feedback fails by a factor $\sim 10$ to give the required normalisation of the
$M_{\rm BH}$–$\sigma$ relation, whereas energy feedback works for a momentum boost factor inferred
from energy conservation ($E_{\rm outflow} \propto 1/v_{\rm shock}$) provided that the thermalisation of the
kinetic energy of the central ultrafast outflows is effective over scales from that of the 
subparsec accretion disk surrounding the central SMBH, where the outflow originates, to 
galactic scales. Observational evidence supports this connection (Feruglio et al., 2015; 
Tombesi et al., 2015). 


8.7 Feedback Can Be Positive 


The fundamental recipe for star formation, involving a density threshold, has had a certain 
amount of success. Sub-grid formulations of star formation are essential, given the broad 
dynamic range needed to tackle cosmological volumes. The Kennicutt–Schmidt law is
treated as the gold standard for testing sub-grid models of star formation in cosmological
simulations, as in Hopkins et al. (2014). The observations can be matched for a wide range
of galaxy masses and surface densities. Control of the gas supply plays a crucial role, since 
the gas reservoir, accreted or in situ, controls star formation. 

It is generally argued that quenching of star formation is needed for massive galaxies. 
Many of these are red and dead, in the sense that there is little ongoing star formation. Yet 
the ongoing accretion of cold dark matter and accompanying gas that seems inevitable in 
typical environments should result in star formation.

There are three basic ingredients to understanding the rate of galaxy formation: gas
accretion, gas outflow and star formation. These are likely to be interconnected and even 
self-regulated. One resolution is to argue that jets from radio galaxies heat up the circum- 
galactic medium sufficiently so that gas infall and cooling are inhibited. On smaller scales 
and at earlier times, vigorous winds from quasars eject large amounts of gas beyond the 
escape velocity from the galaxy. Ejection is possible for sufficiently massive black holes. 
Hence black hole growth plays an important role. Star formation is saturated once the gas 
supply is inhibited. This generates the correlation between black hole and spheroid mass 
and also, if outflow rate balances accretion rate approximately, accounts for the roughly 
50 per cent shortfall of baryons in massive galaxies. 

Yet all is not well. For example, state-of-the-art cosmological simulations fail to reproduce
enough starbursts (Sparre et al., 2015). It is argued that elevated density thresholds
and turbulence suppress star formation, as observed in some giant molecular clouds
(Rathborne et al., 2014). Unfortunately, in more extreme situations of density and turbulence,
star formation is actually enhanced, as for GMCs undergoing nuclear starbursts (Leroy 
et al., 2015). 

One solution may be to argue that feedback is not necessarily always negative. Espe- 
cially if the massive black holes form first, they can initially trigger star formation. Here is 
how this might work. 

A vigorous outflow is normally disruptive with respect to star formation. The gas 
reservoir for star formation is disrupted. Cold gas is entrained in the outflow and 
ejected. Molecular clouds are disrupted by Kelvin—Helmholtz instabilities and evapora- 
tion. These effects are seen in simulations. On the observational side, outflows of cold 
gas that are comparable to or exceed the star formation rate are frequently detected, and 
attributed to the impact of the AGN wind and/or jet. All of this amounts to negative 
feedback. 

However, if molecular clouds are sufficiently massive, they will be compressed rather 
than disrupted as the outflow sweeps around them. Compression leads to collapse and star 
formation. Moreover since the outflows are observed at thousands of kilometres per second, 
one has the potential of stimulating more coherent and intense star formation than would 
be achievable by instability of a self-gravitating gas-rich galactic disk. The enhancement 
in star formation rate should be of the order $v_{\rm outflow}/v_{\rm circ}$, where $v_{\rm outflow}$ is the outflow
velocity and $v_{\rm circ}$ is the rotation velocity. This effect could provide the needed boost in star
formation. 

The usual argument appeals to galaxy mergers, in order to explain the ultra-luminous
starbursts. However, mergers can only augment the gas reservoir available for star formation.
Another ingredient is needed to account for enhanced conversion of gas into stars.
What remains to be determined is whether this additional physics input is no more than 
the merger-driven enhanced gas turbulence or might involve the output from the inevitably 
present AGN, itself undoubtedly provoked by enhanced gas feeding as a consequence of 
the merger.


8.8 Outstanding Questions


We do not yet have a complete theory of galaxy formation. There are more questions 
than answers. Observations are ahead of theory, which is hard pressed to confront the 
latest observational discoveries. I conclude with a list of outstanding questions in galaxy 
formation theory. 


• Are stars inevitable?

This is essential from the fundamental physics perspective. We would like to know, for 
example, given the six parameters that fit the initial conditions of cosmology and the 
large-scale structure of the universe, as demonstrated by the Planck satellite, whether 
stars inevitably form. This is the essence of anthropic arguments that seek to explain the 
cosmological parameters. 

• Does nearby star formation provide a robust template for first or even second generation star formation?

Conditions were very different in the early universe. The systems were gas-rich, there 
was more turbulence, and the assembled halo and baryonic masses were lower. We have 
the advantage of exquisite angular resolution in developing a theory of local star forma- 
tion. Even here we are still far from the ultimate theory. Theorists commonly apply the 
local rules to modelling the distant, more violent universe. 

• Are galaxies inevitable?

Galaxies are where stars form. There is a characteristic mass for a galaxy; both the dwarfs 
and the giants contribute relatively little, or at least are subdominant, in terms of the 
stellar mass budget. But the smallest galaxies provide clues on how galaxies formed 
hierarchically. Their number is important, for it controls the ionisation of the universe. 

• Does the physics that controls the global evolution of galaxies, including mass assembly, star formation rates and gas accretion in nearby galaxies, apply in the early universe?

We observe exceptionally large star formation rates in the early universe, and rapid 
timescales for star formation. Is it more of the same at early epochs, i.e. just like the
Milky Way but with more gas? Massive outflows are observed, driven by active galactic nuclei.
Is this all there is to quenching of star formation? Or is something radically new needed 
that involves positive as well as negative feedback from supermassive black holes? 

• Do the monsters in the middle (supermassive black holes) form by gas accretion or by swallowing stars and smaller massive black holes?

There is barely time available to form supermassive black holes in the early universe. 
One needs monster seeds, themselves massive black holes of 10° — 10*Mo. How these 
seeds form is vigorously debated, with direct collapse being a favoured but by no means 
unique option. 

• Active galactic nuclei: aftermath, precursor or co-eval to star formation?

We cannot be sure of the nature of the first objects in the universe. Most would bet on 
galaxies, but black holes might be an option. The most distant quasar is at a redshift of
about 7, whereas galaxies are detected to a redshift as large as 10. However, these first
galaxies most likely contain supermassive black holes. 

• Will computer simulations eventually resolve the problem of galaxy formation?

State-of-the-art numerical simulations of galaxy formation provide impressive images 
that to the observer’s eye are indistinguishable from the genuine article. The difficulty is 
that the dynamic range required to span star formation and massive black hole accretion 
in a cosmological volume is well beyond current capabilities. The theorist has to add ad 
hoc subgrid physics as a black box to galactic or supergalactic scale simulations. And 
even the resolved physics is subject to assumptions about the initial conditions or even 
the theory of gravity that can profoundly modify the outcome. The resulting images are 
beautiful but incomplete, and certainly cannot be used for robust predictions of what to 
look for next. Nevertheless they are useful indicators, and simulations have proved to 
be essential for constructing mock catalogues as a tool for implementing and analysing 
large-scale structure surveys. There is no doubt that simulations will eventually rise to 
the task of forming galaxies, but it promises to be a long wait.


Foundations of Cosmology: Gravity and the Quantum


9 
The Observer Strikes Back 


JAMES HARTLE AND THOMAS HERTOG 


9.1 Introduction 


The context for this chapter is our universe — the whole closed system at all times con- 
taining all the galaxies, stars, planets, biota, human societies, you, us, etc. There is nothing 
outside. Two kinds of description of the universe can be distinguished: 

Third person descriptions: Descriptions of what the universe contains and how that 
evolves — histories of what occurs. 

First person descriptions: Descriptions of what we as the collection of human scientists 
observe of the universe and use to test cosmological models. 
The connection between these two kinds of description is the subject of this chapter. 

Quantum mechanical theories, and also classical ones, provide probabilities for the different
descriptions. Correspondingly we can distinguish two different kinds of probabilities
for any observable $\mathcal{O}$. Third person probabilities¹

$$p(\mathcal{O}) \qquad (9.1a)$$

for what values of $\mathcal{O}$ occur, and first person probabilities

$$p^{(1p)}(\mathcal{O}) \qquad (9.1b)$$

for what values of $\mathcal{O}$ we observe. The connection between these two kinds of probabilities
is the focus of this chapter.

We will consider theories whose direct outputs are third person probabilities for histories 
of what occurs in the universe. These include probabilities for the existence, evolution, 
and functioning of any observing subsystems such as ourselves. Such subsystems play 
no special role in formulating the theory — they are just one kind of subsystem among 
many. We refer to such theories as “third person” theories. Our most successful theories 
of cosmology are of this kind. Classical physics in general is an example of a third person 
theory. In quantum mechanics the extensions of the ideas of Everett [2, 32] are third person 
theories, including those used in this chapter. 


1 Following Hawking and Hertog [1], in most of our previous work we have used “bottom-up probabilities” and
“top-down probabilities” for what are called here “third person probabilities” and “first person probabilities”.

Observers do play a preferred role in calculating first person probabilities for observations
from third person ones. First person probabilities are third person ones conditioned
on a description of the observational situation. We shall use a variety of specific models to 
infer the following general conclusions: 


• What is most probable to occur is not necessarily what is most probable to be observed.

• Anthropic selection is an automatic consequence of first person probabilities.

• In universes large enough that we may be duplicated as physical subsystems elsewhere,
the description of the observational situation needed to compute first person probabilities 
must also specify whether our particular situation is typical of all the others. It is the 
combination of the model of the observer, including this typicality assumption, and the 
third person theory which is tested by observation. 


Observers and their observations are of central importance in the formulation of Copen- 
hagen quantum mechanics. In the quantum mechanics of closed systems observers might 
seem to have been demoted to the status of one subsystem among many. Indeed, they 
have little effect on third person probabilities. But, as a consequence of the conclusions 
above, the observer returns to importance in the calculation of first person probabilities for 
observations by which the theory is used and tested. The observer strikes back. 

Section 9.2 sketches the framework of the third person theory we will employ. 
Section 9.3 discusses issues involved with first person probabilities. Section 9.4 uses a 
model universe to make more concrete the notions of first and third person descriptions 
of the universe and their associated probabilities. Here we also describe the connection 
between third and first person probabilities in a set of simple models. Section 9.5 describes 
how anthropic selection emerges automatically as a feature of a certain class of first person 
probabilities. Section 9.6 shows how first person probabilities can sometimes be calculated 
directly from the theory with an appropriate coarse graining. Section 9.7 discusses the first 
person predictions of a number of cosmological observables, such as the cosmological con- 
stant, in cosmological models based on the no-boundary quantum state of the universe and 
a dynamical landscape theory in which these observables can take a range of values. In 
Section 9.9 we try to set our results in a more general view of physical theories. 


9.2 Third Person Quantum Cosmology 


This section briefly describes the elements of the theory that predicts third person probabil- 
ities for histories of the universe which we use to reach the general conclusions mentioned 
in the introduction.²


2 For more details the reader can consult the authors’ papers on which this chapter is implicitly based and through
them find further references. For the quantum mechanics of closed systems see e.g. Hartle [3, 4]. For quantum
cosmology see e.g. Hawking and Hertog [1]; Hartle et al. [5, 9]. There is a little more detail about the no-boundary
quantum state of the universe [10] in Appendix 9.2.

We view the universe as a closed quantum mechanical system. It contains everything.
Galaxies, stars, planets, their biota, observers (including us!) etc. are physical subsystems
of the universe subject to its quantum mechanical laws consistent with an Everettian
point of view [2]. The basic variables describing the universe and its contents are
four-dimensional cosmological spacetime geometries and four-dimensional configurations of
matter fields. The basic ingredients of the theory are an action $I$ describing the dynamics
of geometry coupled to matter fields, and a quantum state of the universe $\Psi$. We denote the
theory as $(I, \Psi)$.

The theory $(I, \Psi)$ predicts third person probabilities for the individual members of sets
of alternative four-dimensional histories of the universe, including those histories that
describe its classical evolution. In this way $(I, \Psi)$ can supply third person probabilities for
such large scale features of the universe as the amount of inflation, the approximate
homogeneity and isotropy, the pattern of cosmic microwave background (CMB) variations, and
the formation and evolution of the distribution of galaxies. In principle $(I, \Psi)$ also supplies
third person probabilities for the accidents of biological evolution, the existence of
observers like ourselves, etc. that are well beyond our power to compute or even estimate.
For the examples in this chapter we will mostly use histories in which we are at a single
moment of time: the time approximately 14 Gyr after the big bang when our observations
of the universe are made. The theory $(I, \Psi)$ is an example of what we will call a third
person theory.

We test and utilize a theory not by its third person probabilities for what occurs, but by 
its first person probabilities for what we observe. To compute first person probabilities we 
first need to model the observational situation, which we do next. 


9.3 First Person Quantum Cosmology 


We begin by recalling the definition of a Hubble volume. We cannot see further in the
universe than the distance that light travels to us from the big bang, roughly 14 Gyr ago.
A present volume whose linear size equals this distance is called a “Hubble volume”. In order
of magnitude the distance is $c/H_0$, where $H_0$ is the present Hubble constant. This size is
approximately 4000 Megaparsec or $10^{23}$ km. This is the largest scale over which we can
currently observe the universe.
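
The order-of-magnitude size quoted above is easy to verify. The sketch below evaluates $c/H_0$ in megaparsecs and kilometres, assuming $H_0 = 70$ km/s/Mpc, a round value chosen for this example rather than one given in the text.

```python
C_KM_S = 299792.458    # speed of light, km/s
KM_PER_MPC = 3.0857e19  # kilometres per megaparsec


def hubble_distance_mpc(h0_km_s_mpc):
    """Hubble distance c / H0 in megaparsecs."""
    return C_KM_S / h0_km_s_mpc


def hubble_distance_km(h0_km_s_mpc):
    """Hubble distance c / H0 in kilometres."""
    return hubble_distance_mpc(h0_km_s_mpc) * KM_PER_MPC


h0 = 70.0  # assumed value of the Hubble constant, km/s/Mpc
print(f"c/H0 ~ {hubble_distance_mpc(h0):.0f} Mpc ~ {hubble_distance_km(h0):.2e} km")
```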

As observers we are physical systems within the universe with only a probability to have
evolved in any one Hubble volume and a probability to be replicated in other Hubble volumes
if the universe has a very large number of them. An observer is a very special kind of
fluctuation $D_0$ in the universe. It is a fluctuation that is not singled out by quantum theory
from, say, density fluctuations that produced the CMB. But the probability for observers is
very difficult to compute. We will therefore employ a highly simplified model of observers:
All observers are alike (copies of us) and either exist in any Hubble volume with a third
person probability $p_E(D_0)$ or do not exist with a probability $1 - p_E$. Realistically the
probability $p_E$ incorporates the probability of the accidents of several billion years of biological
evolution. Therefore, whatever its value is, it is very, very, very small. This is a very crude
model of complex observers, but better than many treatments where the probability that
observing systems evolved as part of the universe’s evolution is not considered at all.

Since both we and what we observe are part of the universe, first person probabilities 
can be computed from the third person ones. If we are unique as physical systems within 
the universe the first person probabilities are simply third person ones for what is observed 
conditioned on a description of the observational situation like the one given above, in 
terms of probabilities p_E for data D_0. But if we are not unique then a more careful
specification of the observer is required which includes an assumption about which instance of D_0
is us, or more generally a probability distribution on the set of copies called a xerographic
distribution [11]. In this chapter we will make the minimal assumption that we are equally
likely to be any of the incidences of D_0 that the third person theory (I, Ψ) predicts. The
way to make other assumptions is discussed in Appendix 9.1. 

It is a common intuition that the presence or absence of observers is unimportant for 
the behavior of the universe on cosmological scales because observers are generally small 
subsystems. That intuition is correct for third person probabilities, but it is not correct for 
first person ones. As physical systems we have only a tiny probability to exist in any one 
Hubble volume. That is why third person probabilities are little affected by the presence 
or absence of us. But this also means that we have a greater probability to live in a larger 
universe than a smaller one, because in the larger there are more Hubble volumes in which 
to be. Therefore, even if the third person probabilities favor smaller universes the first 
person ones may favor larger ones. As we will show by example, what is most probable to 
occur (third person) is not necessarily what is most probable to be observed (first person). 
It is in this way that the observer returns to importance in cosmology. 


9.4 Toy Model Universes 


This section derives the connection between third and first person probabilities for a very
simple class of models that provide elementary examples showing that what is most probable to
be observed is not necessarily what is most probable to occur. In the trade these are called
“box models” [12]. 


9.4.1 Third Person Description and Probabilities 


Histories of the universe at one time are modeled as a collection of N boxes representing 
Hubble volumes at that time. Each box has a color, say red or blue, modeling different
CMB maps. Each box may or may not contain an observer. A third person description
of a history specifies the number of boxes, their color, and whether each box contains
an observer or does not. A third person quantum theory specifies probabilities for these
histories: a probability for the number of boxes, a probability for the color of each box,
and the probability p_E for whether an observer exists in each.

We illustrate this with the very simple set of single time histories [12] shown in
Figure 9.1. At one time, only two histories of N and color are possible: N_1 boxes
all red that occurs with a third person probability p_1, and N_2 boxes all blue that occurs with
third person probability p_2 = 1 - p_1. The probability that there is an observer in any box
is p_E, the same for all boxes. The probability that there is no observer in a box is 1 - p_E.


    RED | RED | RED | RED | RED | RED

    BLU | BLU | BLU | BLU

Figure 9.1 Two histories of the simple box model universe described in the text. The boxes model
Hubble volumes. Their color models an observable like the CMB. An “E” means that an observer
is in the box observing its color. A blank means there is no observer in the box. The third person
probabilities for these histories to occur are at the right. (BLU = Blue color; see arXiv:1503.07205
for color figures).


The third person probability of a history with a specific set of n_E of the N_k boxes occupied by
observers is

    p_k \, p_E^{n_E} (1 - p_E)^{N_k - n_E}        (9.2)

where k = 1 (all red) or k = 2 (all blue). The complete set of histories consists of all red
and all blue histories, with different numbers of boxes in each, and the various possible
ways the boxes can be occupied by observers.
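
The bookkeeping of third person probabilities can be checked by brute force. The sketch below assumes the probability of one fully specified history is p_k p_E^{n_E} (1 - p_E)^{N_k - n_E}, enumerates every occupation pattern in both histories, and confirms that the probabilities add up to one. The function names and the sample values (six red boxes, four blue boxes, p_1 = 0.3, p_E = 0.1) are illustrative assumptions.

```python
from itertools import product


def history_probability(p_k: float, p_e: float, n_boxes: int, n_occupied: int) -> float:
    """Probability of history k with one specific set of occupied boxes."""
    return p_k * p_e**n_occupied * (1.0 - p_e) ** (n_boxes - n_occupied)


def total_probability(p_1: float, p_e: float, n_1: int, n_2: int) -> float:
    """Sum the history probabilities over both colors and every occupation pattern."""
    total = 0.0
    for p_k, n_boxes in ((p_1, n_1), (1.0 - p_1, n_2)):
        # Each pattern is a tuple of 0/1 flags, one per box (1 = observer present).
        for pattern in product((0, 1), repeat=n_boxes):
            total += history_probability(p_k, p_e, n_boxes, sum(pattern))
    return total


if __name__ == "__main__":
    # Assumed illustrative values: 6 red boxes, 4 blue boxes.
    print(total_probability(p_1=0.3, p_e=0.1, n_1=6, n_2=4))
```

The enumeration grows as 2^N, so it is only practical for a handful of boxes; its purpose is to make the structure of the set of histories explicit.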


9.4.2 First Person Probabilities for Observation 


We are one of the observers in one of the histories. We now ask for the theory’s prediction 
for the first person probability that we observe red (WOR). To calculate that, assume that 
in either history we are equally likely to be any one of the occurrences of E. Then the 
probability that we observe red (WOR) is evidently the probability that we are in the
history with all red boxes (k = 1). 

The probability that WOR is not the probability p_1 that the history with all red boxes
occurs because that could happen with no boxes occupied by observers. Rather the probability
for WOR is the probability that history k = 1 occurs with at least one observer.
Denoting “at least one E” by E^{≥1} we have

    p(E^{\geq 1} in 1) = 1 - p(no E in 1)
                       = 1 - (1 - p_E)^{N_1}.        (9.3)

The normalized probability that we observe red (WOR) is then

    p^{(1p)}(WOR) = \frac{p_1 [1 - (1 - p_E)^{N_1}]}{\sum_k p_k [1 - (1 - p_E)^{N_k}]}.        (9.4a)

Similarly, the probability that we observe blue is

    p^{(1p)}(WOB) = \frac{p_2 [1 - (1 - p_E)^{N_2}]}{\sum_k p_k [1 - (1 - p_E)^{N_k}]}.        (9.4b)
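
The first person probabilities can be evaluated directly once the third person inputs are fixed. The following is a minimal sketch, assuming each history k is weighted by p_k times the probability 1 - (1 - p_E)^{N_k} of at least one observer, and then normalized. The helper names are hypothetical; `math.log1p` and `math.expm1` are used so that the expression stays accurate when p_E is extremely small.

```python
import math


def prob_at_least_one(p_e: float, n_boxes: float) -> float:
    """1 - (1 - p_E)^N, evaluated stably for tiny p_E and large N."""
    return -math.expm1(n_boxes * math.log1p(-p_e))


def first_person_probs(p_hist, n_boxes, p_e):
    """Normalized first person probability for the color of each history."""
    weights = [p * prob_at_least_one(p_e, n) for p, n in zip(p_hist, n_boxes)]
    norm = sum(weights)
    return [w / norm for w in weights]


if __name__ == "__main__":
    # Assumed illustrative inputs: p_1 = 0.3, p_2 = 0.7, N_1 = 6, N_2 = 4, p_E = 0.1.
    p_red, p_blue = first_person_probs([0.3, 0.7], [6, 4], 0.1)
    print(f"WOR = {p_red:.4f}, WOB = {p_blue:.4f}")
```

The same function accepts more than two histories, since the normalization simply sums over whatever list of (p_k, N_k) pairs is supplied.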


We now discuss important limiting cases. 

Rare in all histories: When both p_E N_1 ≪ 1 and p_E N_2 ≪ 1 physical systems like us
occur only rarely in each of the two possible histories. Therefore we can assume that as a
physical subsystem we are unique in the universe. The probabilities for color observation
(9.4) then become

    p^{(1p)}(WOR) \approx \frac{N_1 p_1}{N_1 p_1 + N_2 p_2},        (9.5a)

    p^{(1p)}(WOB) \approx \frac{N_2 p_2}{N_1 p_1 + N_2 p_2}.        (9.5b)


The first person probabilities for our observation of red or blue are the third person proba- 
bilities of the all red and all blue histories weighted by the number of Hubble volumes in 
each. This is called “volume weighting”. It favors larger universes where there are more 
places for us to occur as has been extensively discussed in cosmology (e.g. Hartle et al. 
[5]; Page [13]; Hawking [14]). 

It is important to emphasize that volume weighting is not an extra assumption in addition 
to the theory of the histories. Rather it is a straightforward consequence of that theory in 
models where we are rare in all histories. 
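
The step that turns the exact first person probabilities into volume weighting is a first-order binomial expansion. A short derivation, assuming p_E N_k ≪ 1 for both histories, can be written as:

```latex
% Assumes p_E N_k \ll 1 for both histories k = 1, 2.
\begin{align*}
1 - (1 - p_E)^{N_k} &= N_k\, p_E - \binom{N_k}{2} p_E^2 + \cdots \approx N_k\, p_E , \\
p^{(1p)}(\mathrm{WOR}) &\approx \frac{p_1 N_1 p_E}{p_1 N_1 p_E + p_2 N_2 p_E}
  = \frac{N_1 p_1}{N_1 p_1 + N_2 p_2} .
\end{align*}
```

The common factor p_E drops out of the ratio, which is why the unknown observer probability does not appear in the volume-weighted result.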

In a third person description of this model blue is more likely to occur if p_2 > p_1. But
red is more likely to be observed if N_1 p_1 > N_2 p_2. Thus we have an elementary example in
which what is most probable to be observed is not necessarily what is most likely to occur. That
is the return of the observer.

Other limiting cases are also interesting: 

Common in both histories: When both p_E N_1 ≫ 1 and p_E N_2 ≫ 1, copies of us as
physical systems are common in both histories. Then the probabilities (9.4) are

    p^{(1p)}(WOR) \approx p_1,        (9.6a)
    p^{(1p)}(WOB) \approx p_2.        (9.6b)

Thus, when all histories in the ensemble are very large universes, what is predicted to be
observed is also what is predicted to occur.

Rare in one history, common in the other: When, say, p_E N_1 ≫ 1 but p_E N_2 ≪ 1, copies
of us as physical systems are common in the all-red history and rare in the blue one. We
have [1 - (1 - p_E)^{N_1}] ≈ 1 and [1 - (1 - p_E)^{N_2}] ≈ N_2 p_E ≈ 0 since p_E is very, very small.
Then

    p^{(1p)}(WOR) \approx 1,        (9.7a)
    p^{(1p)}(WOB) \approx 0.        (9.7b)


This is the case when the return of the observer has its most striking effect. The first person 
probabilities select large histories over small ones even when the latter have larger third 
person probabilities. We will see a concrete example of this in a more realistic model in 
Section 9.8. 
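
The three limiting cases can be compared numerically with the exact box-model expression. The sketch below is illustrative only: it assumes an all-red history that is the less likely one to occur (p_1 = 0.2), a tiny observer probability p_E = 10^{-30}, and box counts chosen so that observers are rare in both histories, common in both, or common in red only. All numerical values and names are hypothetical.

```python
import math


def observe_red(p_1: float, n_1: float, n_2: float, p_e: float) -> float:
    """Exact first person probability of observing red in the two-history box model."""
    def at_least_one(n: float) -> float:
        # 1 - (1 - p_E)^n without loss of precision for tiny p_E.
        return -math.expm1(n * math.log1p(-p_e))

    w_red = p_1 * at_least_one(n_1)
    w_blue = (1.0 - p_1) * at_least_one(n_2)
    return w_red / (w_red + w_blue)


P_1 = 0.2      # assumed: the all-red history is the LESS likely one to occur
P_E = 1e-30    # assumed tiny observer probability per Hubble volume

rare = observe_red(P_1, n_1=1e6, n_2=1e5, p_e=P_E)      # p_E N << 1 in both
common = observe_red(P_1, n_1=1e40, n_2=1e35, p_e=P_E)  # p_E N >> 1 in both
mixed = observe_red(P_1, n_1=1e40, n_2=1e5, p_e=P_E)    # common in red only

if __name__ == "__main__":
    print(f"rare: {rare:.4f}  common: {common:.4f}  mixed: {mixed:.4f}")
```

In the rare case the result matches the volume-weighted ratio N_1 p_1 / (N_1 p_1 + N_2 p_2), so red is the more probable observation even though blue is the more probable occurrence; in the common case the result reduces to p_1; in the mixed case it is indistinguishable from one.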


9.4.3 A Crisis of Computability? 


Box models divide the theory for predicting first person probabilities for what we observe 
into two parts. First, there is the specification of the third person probabilities p_k for the
large scale features of the models (the number of Hubble volumes and the color of each).
Second, there are the third person probabilities for the occurrence of observers inside each
Hubble volume, summarized by the one number p_E, which are used to describe and to
condition on the observational situation.

As we will see in Section 9.7, it is possible to make computationally tractable calculations
of p_k in simple models. But the probability p_E would naturally include the probability
that human observers evolved in a Hubble volume. To calculate this, or even estimate it,
would involve considering several billion years of the chance accidents of biological evolution.
This is well beyond our power to compute even assuming that we have a theory that
is well enough formulated to define the task.

It is therefore fortunate that in all of the interesting limiting cases discussed above the 
probability p_E cancels out. Thus, it is in the regime of universes so small that we are unique,
or universes so large that we are common, that observations are easily and objectively 
calculated. 


9.5 Anthropic Selection is Automatic in Quantum Cosmology 


Consider again the simple box model in Section 9.4.2 but suppose that the probability p_E^R for
an observer to occupy a red box is different from the probability p_E^B to occupy a blue box.
Suppose further that for some reason red is necessary for observers so that the probability
p_E^B to occupy a blue box is exceedingly small. It is then obvious, and easily worked out,
that the first person probability that we observe red is very near unity. This is a very simple 
example of anthropic selection: The all-red history is selected because observers do not 
exist in the alternative. 
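
The claim that the first person probability for red approaches unity can be worked out with a small generalization of the box-model formula. The sketch below assumes that the only change is to let the observer probability depend on the color of the box; the function name and all numerical inputs are hypothetical choices made for illustration.

```python
import math


def observe_red_colored(p_1, n_1, n_2, p_e_red, p_e_blue):
    """First person probability of red when the observer probability depends on color."""
    def at_least_one(p_e, n):
        # Probability that at least one of n boxes holds an observer.
        return -math.expm1(n * math.log1p(-p_e))

    w_red = p_1 * at_least_one(p_e_red, n_1)
    w_blue = (1.0 - p_1) * at_least_one(p_e_blue, n_2)
    return w_red / (w_red + w_blue)


if __name__ == "__main__":
    # Assumed: red is very unlikely to occur (p_1 = 0.01), but observers
    # are overwhelmingly more likely in red boxes than in blue ones.
    result = observe_red_colored(0.01, 1000, 1000, p_e_red=1e-6, p_e_blue=1e-20)
    print(f"first person probability of red = {result:.12f}")
```

Even with a third person probability of only one percent for the all-red history, the first person probability of observing red is essentially one, because the blue history almost never contains an observer.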

The key point here is that anthropic selection is automatic. No additional assumptions
or principles were needed beyond the probabilities p_1, p_2, p_E^R and p_E^B which are all third
person probabilities following from the underlying theory. Anthropic selection emerges as
an intrinsic feature of the first person probabilities for the observer’s observations [15].

This is the case also in more general and more realistic cosmological models. As an
example consider an ensemble of single time histories which we take to be at the present
age of the universe t_0 ~ 14 Gyr. Assume that the theory (I, Ψ) predicts third person
probabilities for what we may call a set of background histories each with the same number
of Hubble volumes but differing in the value of the cosmological constant Λ, which is
assumed positive and the same in all Hubble volumes of one history. The theory thus
allows the cosmological constant to vary. The histories can be labeled by the value of Λ
and their third person probabilities written p(Λ). As a fluctuation on these backgrounds
the theory (I, Ψ) also predicts a probability that in any Hubble volume there occur data
D_0 that describe our observational situation (but not including any record we might have
of the value of Λ). These probabilities depend on Λ. The complete set of histories is thus
labeled by the backgrounds and which Hubble volumes in them are occupied by D_0.

Assume that we are typical of the incidences of D_0 in any one history. The first person
probability that we observe a value of Λ (WOΛ) is the third person probability for the
history Λ conditioned on the existence of at least one instance of D_0 (cf. Eq. (9.3) et seq.).
Using the Bayes identity we have

    p^{(1p)}(WO\Lambda) = p(\Lambda | D_0^{\geq 1}) = \frac{p(D_0^{\geq 1} | \Lambda) \, p(\Lambda)}{\int d\Lambda \, p(D_0^{\geq 1} | \Lambda) \, p(\Lambda)}.        (9.8)



For values of Λ for which p(D_0^{≥1}|Λ) is negligibly small the probability that we observe
that value will also be negligibly small. Hence anthropic selection of values of Λ that are
consistent with observers is automatic. This kind of argument assumes that the third person
theory allows Λ to vary over an anthropically allowed range, that is, a range consistent with the
evolution of D_0. We give an example of such a model in Section 9.8. If however the theory
determines a unique value of Λ, then either that must be consistent with D_0 or the theory
is incorrect (e.g. Hartle [21]).
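
The Bayes identity for the observed cosmological constant can be illustrated on a discrete grid, with the normalizing integral replaced by a sum. Everything numerical in the sketch below is a hypothetical toy: the grid of Λ values (in arbitrary units), a third person weight that favors large Λ, and a likelihood for at least one instance of D_0 that vanishes above an assumed anthropic cutoff.

```python
def posterior_over_lambda(lambdas, prior, likelihood):
    """Discrete Bayes update: p(Lambda | at least one instance of D_0)."""
    weights = [likelihood(lam) * prior(lam) for lam in lambdas]
    norm = sum(weights)
    return [w / norm for w in weights]


# --- Hypothetical toy inputs, chosen only for illustration ---
LAMBDAS = [0.5, 1.0, 2.0, 4.0, 8.0]     # arbitrary units
LAMBDA_MAX = 2.0                        # assumed anthropic cutoff


def toy_prior(lam):
    """Assumed unnormalized third person weight that favors large Lambda."""
    return lam


def toy_likelihood(lam):
    """Assumed p(D_0 >= 1 | Lambda): observers cannot form above the cutoff."""
    return 1e-3 if lam <= LAMBDA_MAX else 0.0


POSTERIOR = posterior_over_lambda(LAMBDAS, toy_prior, toy_likelihood)

if __name__ == "__main__":
    for lam, post in zip(LAMBDAS, POSTERIOR):
        print(f"Lambda = {lam:4.1f}  first person probability = {post:.4f}")
```

Values of Λ above the cutoff receive zero first person probability even though the toy third person weight favors them, and the small constant likelihood below the cutoff cancels in the normalization.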

Barrow and Tipler [16] and Weinberg [17] have argued that the observed value of the
cosmological constant could not be much larger than Λ ~ 10^{-123} in Planck units, not far
from its observed value. Were Λ larger the universe would expand too rapidly for galaxies
to have formed by the present age t_0 ~ 14 Gyr and human observers would not be here. In
our scheme the probability p(D_0^{≥1}|Λ) would be near zero.

This kind of argument is an example of what one could call traditional anthropic reasoning.
In this, the anthropically allowed range of values of an observable like Λ is determined
from classical arguments like those involving galaxy formation mentioned above. It is then
assumed that there is some unknown mechanism for Λ to vary over this range. In the
absence of a specific model of this mechanism a uniform distribution over the range is
often assumed and the most probable Λ predicted on the basis of purely anthropic arguments.
Impressively detailed calculations were carried out in this way with mixed results,
e.g. Tegmark et al. [19]. But this program suffers from several uncertainties. For example,
different results are obtained if different combinations of constants are assumed to vary
[20]. In addition, traditional anthropic reasoning is not part of any theoretical framework.
Rather anthropic selection arises from an additional assumption or “principle”.

The chief difference with anthropic selection in quantum cosmology is that there (1)
the theory (I, Ψ) provides a mechanism for what constants vary and how they vary, (2) the
observer is a physical subsystem with a certain probability predicted by the theory to evolve 
in any Hubble volume, and (3) anthropic selection emerges automatically as a property of 
the first person probabilities by which we use and test the theory. To summarize, anthropic 
selection in quantum cosmology: 


• Is not a choice.
• Does not require invoking some “anthropic principle”.
• Does not change the objective nature of the underlying third person theory.
• Does require a typicality assumption if there is a chance we are replicated.


The observer is important for first person probabilities for observations through 
anthropic selection. In Section 9.8.3 we will describe a more precise calculation of
p^{(1p)}(WOΛ) in which the probabilities p(Λ) in Eq. (9.8) are obtained from a concrete
model of the universe’s quantum state Ψ in combination with a dynamical theory I in
which Λ can vary.


9.6 A Remark on Coarse Graining 


Our observations of the universe extend at most over a Hubble volume. But this may only 
be a tiny region in a vastly larger inflationary universe of the kind contemplated in con- 
temporary cosmology. Indeed, some calculations [24] suggest that the universe typically 
becomes spatially infinite as a consequence of eternal inflation. Much of the third person 
information about what occurs in the universe on very large scales is irrelevant for the 
first person predictions for our observations in our Hubble volume, and perhaps not even 
well defined. In this section we illustrate how, in certain models, first person probabilities 
for our observations can be calculated directly using coarse grainings that ignore most of 
the structure outside our Hubble volume. Coarse graining is not an ad hoc assumption. It 
is central and inevitable in quantum mechanics, statistical physics, complexity, and many 
other areas of science (e.g. Gell-Mann and Hartle [29]). 

We illustrate how coarse graining works with a simple box model of the kind used in 
Section 9.4. Consider a universe with an infinite set of boxes as illustrated in Figure 9.2. 
Each box has a color. This is either yellow with probability p_Y or green with probability
p_G = 1 - p_Y. There is a probability p_E for an observer to be in any box observing its
color. Thus all the boxes are statistically identical. In this sense this universe has a discrete 
translation symmetry. 

Now we ask for the first person probability that we observe yellow. The answer is
obvious. We are in one box. All the boxes are statistically the same. The first person
probability p^{(1p)}(Y) for observing yellow is the same as the probability that any of the boxes is
yellow, viz.

    p^{(1p)}(Y) = p_Y.        (9.9)


Figure 9.2 Fine and coarse-grained histories of a box model discussed in the text. The boxes model
Hubble volumes. Their color models an observable like the CMB, in this case either yellow or green.
An “E” means that an observer is in the box observing its color. The top history is fine grained with a
color and E or not E in every box. The bottom history is coarse grained. The possibilities have been
summed over for every box except one, our box. That enables a straightforward calculation of the
first person probability that we see one color or the other. The details outside our box are irrelevant
for this. (GRN = green and YEL = yellow colors; see arXiv:1503.07205 for color figures).



Although obvious, it is instructive to see how this result follows from our general frame- 
work through coarse graining starting from third person probabilities. A fine-grained 
history would specify the color (Y, G) of each box, and whether there exists an observer in
it or not (E, Ē) (top in Figure 9.2). The third person probability for one particular history
having n_Y yellow boxes, n_G green boxes, n_E boxes with observers, and n_Ē without, is

    (p_Y)^{n_Y} (1 - p_Y)^{n_G} (p_E)^{n_E} (1 - p_E)^{n_{\bar{E}}}.        (9.10)



But these probabilities tend to zero in a universe with an infinite number of boxes. The 
probabilities for these fine-grained histories are not well defined. 

Finite probabilities can be obtained by coarse graining. That is, they can be obtained 
by summing Eq. (9.10) over what is irrelevant for our observations. To get first person
probabilities for our observations we can coarse grain over the alternatives in every box but
ours. That means summing the probabilities Eq. (9.10) over the alternatives (Y, G) and (E, Ē),
giving a factor of unity for every box but ours (bottom in Figure 9.2). That gives the first 
person probabilities. 
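
The coarse-graining argument can be checked explicitly in a finite truncation of the model. The sketch below enumerates every fine-grained history of a small number of boxes, treats box 0 as "our" box, and sums over the alternatives in all other boxes. Identifying our box by conditioning on it containing an observer is an assumption made here for illustration, as are the function name and the sample values.

```python
from itertools import product


def coarse_grained_yellow(p_y: float, p_e: float, n_boxes: int) -> float:
    """P(our box is yellow | our box holds an observer), summing over all others.

    Box 0 plays the role of "our" box; every fine-grained history is
    enumerated explicitly, so n_boxes must stay small.
    """
    joint_yellow_and_observer = 0.0
    observer_in_our_box = 0.0
    # A box state is (is_yellow, has_observer); a history is one state per box.
    for history in product(product((True, False), repeat=2), repeat=n_boxes):
        prob = 1.0
        for is_yellow, has_observer in history:
            prob *= p_y if is_yellow else 1.0 - p_y
            prob *= p_e if has_observer else 1.0 - p_e
        our_yellow, our_observer = history[0]
        if our_observer:
            observer_in_our_box += prob
            if our_yellow:
                joint_yellow_and_observer += prob
    return joint_yellow_and_observer / observer_in_our_box


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, coarse_grained_yellow(p_y=0.3, p_e=0.05, n_boxes=n))
```

The answer equals p_Y for every number of boxes tried, because each box other than ours contributes a factor of unity once its alternatives are summed.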

This result generalizes straightforwardly to any finite number of kinds (colors) of Hubble
volumes, to models where the probabilities p_E depend on the color, and to more
than one configuration (history) of boxes with third person probabilities like the p_k of
Section 9.4.1. In Section 9.8 we review an example of coarse graining used in realistic
cosmology (see also Hartle et al. [9]). 

To summarize, in infinite or just very large universes focusing on our observations in 
our Hubble volume motivates coarse grainings that directly lead to well defined first per- 
son probabilities for observations. It is an intriguing open question whether such a local 
framework for prediction can be achieved more generally in quantum cosmology. Such a 
framework would truly return the observer to importance.


9.7 Inflation in Quantum Cosmology


We now turn from illustrative but artificial toy models to more realistic cosmological 
models. In this and the next section we consider two examples in which the return of 
the observer is important — where the probabilities for what we observe are significantly 
different from the probabilities for what occurs. 

The two examples share a common theoretical framework (I, Ψ). For the dynamics we
assume a spatially closed cosmological spacetime metric g coupled to a number of scalar
fields φ. The dynamics is specified by a (Euclidean) action I[g, φ] consisting of the action
for general relativity plus the action for the scalar fields φ coupled to the metric g. For the
state Ψ we assume the no-boundary wave function of the universe (NBWF) [10]. This is
the natural analog of the notion of “ground state” for closed cosmologies.

Many of our large scale observations are of properties of our universe’s classical history. 
The rate of the universe’s expansion and the distribution of galaxies in our Hubble volume 
are examples. A history of the universe behaves classically when the quantum probability 
is high that it exhibits correlations in time governed by the Lorentzian Einstein equation 
and the classical field equations. The NBWF predicts an ensemble of alternative classical 
histories along with third person probabilities for which history in the ensemble occurs. 
For a little more on what the NBWF is, and how it predicts probabilities see Appendix 9.2. 
For much more see Hartle et al. [6].

The first example is concerned with the classical cosmological histories predicted by the
NBWF when the matter consists of a single scalar field moving in a quadratic potential

    V(\phi) = \frac{1}{2} m^2 \phi^2        (9.11)

with m^2 ≪ 1 and zero cosmological constant. As in the rest of this chapter we are
using Planck units where ħ = 8πG = c = 1. Geometry and field are restricted to
be homogeneous and isotropic thus defining a minisuperspace model. Lorentz-signatured
homogeneous and isotropic spacetime geometries can be described by metrics of the form

    ds^2 = -dt^2 + a^2(t) \, d\Omega_3^2.        (9.12)

Here, dΩ_3^2 is the metric on a unit round three-sphere. The time-dependence of the scale factor
a(t) describes how this closed universe expands and contracts. Standard closed FLRW
cosmological models have metrics of this form that satisfy the Einstein equation. The
homogeneous field is a function only of time, viz. φ = φ(t). A quantum history of this
model universe is therefore specified by (a(t), φ(t)). Classical histories are described by a
pair (a(t), φ(t)) that obey the Einstein equation and the classical equations of motion for
the field.

As sketched in Appendix 9.2 the NBWF in this minisuperspace model predicts a one-parameter
ensemble of possible classical histories. These classical histories can be labeled
by a parameter φ_0 that can be roughly thought of as the value of the scalar field from which
it starts to roll down to the bottom of the potential Eq. (9.11). By examining any one classical
a(t) we can find the number of efolds N_e of slow roll inflation it has. Remarkably all
histories in the NBWF classical ensemble turn out to have some inflation at early times [6].
Inflation and the emergence of a classical universe in the NBWF are therefore profoundly
connected [6, 28].

Does our theory (I, Ψ) predict a significant probability for an extended period of inflation
in the early universe? As stated, this question is ambiguous. It could mean “Are the
third person probabilities from (I, Ψ) high for classical histories with an early period of
inflation?” But it could also mean “Are the first person probabilities high that the classical 
history we observe has a significant period of early inflation?” We will display the answers 
to both questions in our model and find that they are significantly different. 

To evaluate the first person probabilities for the amount of inflation we can assume that 
we are rare physical systems in any of the classical histories predicted by the NBWF. 
This is because the number of Hubble volumes in all histories of the classical ensemble 
with matter densities below the Planck density is much smaller than any realistic value of
1/p_E [7]. Hence volume weighting Eq. (9.5) connects the first person probabilities for our
observations with the third person probabilities for what occurs. 

The third person probabilities p(φ_0) for the histories predicted by the NBWF are
given approximately by Eq. (9.29)

    p(\phi_0) \propto \exp[3\pi / V(\phi_0)].        (9.13)

It turns out in this model that roughly N_e ≈ (3/2) φ_0^2 starting at N_e ≈ 1. The third person
NBWF probabilities for classical histories Eq. (9.13) thus imply probabilities p(N_e) for the
number of efolds that occur, starting roughly at N_e ~ 1,

    p(N_e) \propto \exp\left( \frac{9\pi}{m^2 N_e} \right).        (9.14)



Third person probabilities are therefore much larger for a small number of efolds than 
for the minimal number ~ 60 required for agreement with observation, as illustrated in 
Figure 9.3. 

The situation is much different for first person probabilities for the amount of inflation 
in our history — the one we observe. Since it is assumed that we are rare, the first person 
probabilities are the volume weighted third person probabilities as in Eq. (9.5). During 
a period of inflation the volume of the universe expands by a factor exp(3N_e). The first
person probabilities are then approximately

    p^{(1p)}(N_e) \propto \exp\left( 3 N_e + \frac{9\pi}{m^2 N_e} \right).        (9.15)



Figure 9.3 Third (left) and first person (right) probabilities for the number of efolds N_e of matter-
driven inflation in the early universe in the no-boundary quantum state Ψ. The first person
probabilities favor universes with a large amount of inflation because in the larger universes that
result from an extended period of early inflation there are more Hubble volumes for us to be in.


Thus the NBWF predicts a significant probability for us to find ourselves in a universe with
a large number of efolds (Figure 9.3). This implies there is a significant probability for us
to observe the consequences of inflation: approximate spatial flatness, a scale invariant
spectrum of density fluctuations, etc.
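
The different behavior of the two distributions can be seen by comparing their exponents. The sketch below assumes the third person exponent is 9π/(m^2 N_e), as follows from p ∝ exp[3π/V] with V = m^2 φ_0^2 / 2 and N_e ≈ (3/2) φ_0^2, and that volume weighting adds 3 N_e to it. The probabilities are left unnormalized, the function names are hypothetical, and the mass m = 0.1 is an illustrative value rather than a realistic one.

```python
import math


def log_third_person(n_e: float, m: float) -> float:
    """Third person exponent: 9*pi / (m^2 N_e)."""
    return 9.0 * math.pi / (m * m * n_e)


def log_first_person(n_e: float, m: float) -> float:
    """First person exponent: volume factor 3 N_e plus the third person term."""
    return 3.0 * n_e + log_third_person(n_e, m)


def turning_point(m: float) -> float:
    """N_e at which the first person exponent stops falling and starts rising.

    Setting d/dN_e [3 N_e + 9 pi / (m^2 N_e)] = 0 gives N_e = sqrt(3 pi) / m.
    """
    return math.sqrt(3.0 * math.pi) / m


if __name__ == "__main__":
    m = 0.1  # assumed illustrative mass in Planck units
    print(f"turning point at N_e = {turning_point(m):.1f}")
    for n_e in (10, 30, 60, 120, 240):
        print(n_e, round(log_third_person(n_e, m), 1), round(log_first_person(n_e, m), 1))
```

The third person exponent falls monotonically with N_e, whereas the first person exponent grows linearly once N_e exceeds sqrt(3π)/m, which is where the volume factor takes over.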

There is a higher probability for us to live in a larger universe because we are a physical 
subsystem within the universe with a very small probability p_E to have evolved in any
Hubble volume. In the larger universes that result from an extended period of early inflation
there are more Hubble volumes for us to be in. Thus the observer returns to importance in the
probabilities for observing the consequences of inflation. 


9.8 Eternal Inflation and Anthropic Selection 


We now extend the above model in three crucial ways to arrive, at last, at a realistic model 
for the early universe: 

(1) Fluctuations: We include fluctuations away from homogeneity and isotropy. These 
fluctuations provide the necessary degrees of freedom to describe e.g. the pattern of tem- 
perature variations in the CMB. A consequence is that a range of classical histories exhibit 
eternal inflation becoming spatially very large and highly inhomogeneous on the largest 
scales. 

(2) Landscape potential: We no longer assume one scalar field in a potential with a single 
minimum like Eq. (9.11). Rather we assume many scalar fields in a multi-field potential 
that has many different minima with different directions of approach, as a toy model for 
the string landscape [27]. As a consequence the dynamical theory provides a mechanism 
for the observable parameters of the histories to vary. Automatic anthropic selection can 
then be explicitly illustrated. 

(3) Not rare but common: The above assumption that we are rare in all histories is
no longer tenable with eternal inflation. The probability p_E for us to exist in any Hubble
volume is very small. Nevertheless, we will be common in a range of histories where the
universe becomes sufficiently large. In that range the connection between third and first
person probabilities will no longer involve volume weighting but instead be given by Eq.
(9.6).

We now consider the meaning and implications of these extensions in more detail.


9.8.1 Eternal Inflation: Histories where Observers are Common 


Cosmological perturbation theory extends the homogeneous and isotropic models of 
Section 9.7 straightforwardly to include linear fluctuations away from these symmetries 
for each classical history. The early period of slow roll inflation in the classical NBWF 
histories stretches and amplifies quantum vacuum fluctuations and generates a pattern of 
classical perturbations on scales larger than the horizon, which much later produce the 
small temperature fluctuations we observe in the microwave sky. However, it turns out that 
very long-wavelength fluctuations, those which leave the horizon and become classical at 
values of φ where

    V^3 > |dV/d\phi|^2,        (9.16)

have a large expected amplitude. There is non-perturbative evidence that histories in which
φ is initially in this regime reach very large (or even infinite) spatial volumes, because
these large very long-wavelength fluctuations tend to make them highly inhomogeneous 
on scales much larger than our Hubble volume [22-24]. This dynamical process is known 
as eternal inflation [25, 26], and Eq. (9.16) defines the regime of field values where eternal 
inflation occurs. 

For quadratic potentials like Eq. (9.11), the condition for eternal inflation Eq. (9.16) is
met for sufficiently large φ_0 > φ_ei where

    \phi_{ei} \sim 1/\sqrt{m},        (9.17)



well below the Planck scale φ_pl ~ 1/m for realistic values m ~ 10^{-6}.

Therefore, when we include fluctuations the set of classical NBWF histories divides into
two parts: Those with φ_0 ≳ φ_ei which are inhomogeneous on the largest scales and have a
great many Hubble volumes N due to eternal inflation, and those with φ_0 ≲ φ_ei which are
much smaller and approximately homogeneous.
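
For the quadratic potential the threshold of eternal inflation can be written in closed form. The sketch below assumes the threshold is defined by equality in the condition V^3 = |dV/dφ|^2 with V = m^2 φ^2 / 2, and compares it with the field value at which V reaches the Planck density, V = 1. The function names and the mass m = 10^{-6} are illustrative assumptions.

```python
import math


def eternal_inflation_threshold(m: float) -> float:
    """Field value where V^3 = |dV/dphi|^2 for V = m^2 phi^2 / 2.

    (m^2 phi^2 / 2)^3 = (m^2 phi)^2  =>  phi^4 = 8 / m^2  =>  phi = 8**0.25 / sqrt(m).
    """
    return 8.0**0.25 / math.sqrt(m)


def planck_density_field(m: float) -> float:
    """Field value where V = 1 (Planck density): phi = sqrt(2) / m."""
    return math.sqrt(2.0) / m


if __name__ == "__main__":
    m = 1e-6  # assumed illustrative mass in Planck units
    print(f"phi_ei = {eternal_inflation_threshold(m):.3e}")
    print(f"phi_pl = {planck_density_field(m):.3e}")
```

For small m the threshold scales as 1/sqrt(m) while the Planck-density value scales as 1/m, so there is a wide range of field values in which eternal inflation occurs below the Planck density.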

This division has significant implications for the first person probabilities for observation.
We are now effectively in the box model case Eq. (9.7) where (assuming
typicality)

    [1 - (1 - p_E)^N] \approx 1,               \phi_0 > \phi_{ei},        (9.18a)
    [1 - (1 - p_E)^N] \approx p_E N \ll 1,     \phi_0 < \phi_{ei}.        (9.18b)

Eternally inflating histories are thus strongly selected, whereas histories with slow roll
inflation only are strongly suppressed by the very small value of p_E. In selecting for
eternally inflating universes as the ones we observe, the observer has returned in force.


9.8.2 Landscapes: A Mechanism for Cosmological Parameters to Vary


As discussed in Section 9.5, the anthropic selection of observed cosmological parameters 
requires a third person theory (I, Ψ) that allows the parameters to vary. Theories with
landscape potentials are a very simple example.

To illustrate what landscape potentials are, consider a third person theory with two scalar
fields φ_1 and φ_2 moving in a potential V(φ_1, φ_2). A three-dimensional plot of this potential
could be made using φ_1 and φ_2 as the x and y-axes and plotting V along the z-axis. The plot
might resemble a mountainous landscape on Earth whence the name “landscape potential”. 

Suppose the potential has a number of different minima each surrounded by a number 
of different valleys leading to it.³ In our past history the fields “rolled down” a particular
valley (“our” valley) to a particular minimum (“our” minimum). The value of the potential
at our minimum is the value of the cosmological constant we would observe. The shape of 
our valley near our minimum determines the spectrum of density fluctuations in the CMB 
we would see. A third person theory that predicts probabilities for which of the possible 
histories occurs is thus a starting point for calculating the first person probabilities for the 
values of these parameters we observe. 

The notion of a landscape potential is easily extended to many scalar fields φ so that V =
V(φ). Assume that each minimum in V(φ) is surrounded by effectively one-dimensional
valleys. Suppose that these valleys are separated by large barriers so that transitions
between valleys are negligible. Then we have, in effect, an ensemble of one-dimensional
potentials V_K(φ_K), K = 1, 2, ..., one for each valley.

The NBWF predicts the probabilities for classical inflationary histories for each one-dimensional
potential in this landscape. The total classical ensemble is the union of all
these. An individual history can therefore be labeled by $(K, \phi_{K0})$ where, as explained
before, $\phi_{K0}$ is roughly the value of the field $\phi_K$ at the start of its roll down the potential
$V_K$. Thus the NBWF predicts third person probabilities $p(K, \phi_{K0})$ that our universe
rolled down the potential from $\phi_{K0}$ in the valley K to its minimum.

Generically a landscape of the kind under discussion contains some valleys K where the
potential $V_K(\phi_K)$ has a regime of eternal inflation. Eq. (9.18) implies that histories in these
valleys will dominate the first person probabilities. In the presence of eternal inflation the 
first person probabilities are 


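\[ p^{(1p)}(K, \phi_{K0}) \approx p(K, \phi_{K0}) \propto \exp[3\pi / V_K(\phi_{K0})], \qquad \phi_{K0} \gtrsim \phi_{Kei}, \]
\[ p^{(1p)}(K, \phi_{K0}) \approx 0, \qquad \phi_{K0} \lesssim \phi_{Kei}. \qquad (9.19) \]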


When the potentials are increasing with field, the most probable NBWF history in a given
valley will be that where the field starts around the exit of eternal inflation, i.e. $\phi_{K0} \approx \phi_{Kei}$.
Thus, to good approximation, the probability that we rolled down in valley K is


\[ p^{(1p)}(K) \approx p(K) \propto \exp[3\pi / V_K(\phi_{Kei})]. \qquad (9.20) \]


3 Valleys were called “channels” in Hartle et al. [9] and other places.


We now apply this result in a concrete model landscape.


9.8.3 First Person Predictions of the No-Boundary State in a Landscape Model 


We now discuss an example of a first person prediction in a landscape model with valleys 
where the eternal inflation condition Eq. (9.16) is satisfied. This example relates directly to 
the historical effort described briefly in Section 9.5 to determine the anthropically allowed 
ranges of cosmological parameters that are consistent with our existence as observers (see, 
e.g. Barrow and Tipler [16]; Weinberg [17]; Tegmark et al. [19]; Livio and Rees [20]). 

Specifically, we calculate the NBWF predictions for the first person probability of a correlation
between three observed numbers. First is the observed value of the cosmological
constant $\Lambda \sim 10^{-123}$ (in Planck units). Second is the observed value of the amplitude of
primordial density fluctuations $Q \sim 10^{-5}$ in the CMB. And third is the part of our data $D_{loc}$
on the scales of our galaxy and nearby ones, which, together with $\Lambda$, determine the present
age of the universe $t_0(\Lambda, D_{loc}) \sim 14$ Gyr. Thus we are interested in $p^{(1p)}(\Lambda, Q, D_{loc})$, the
first person probability for this correlation to be observed. Many other such correlations
could be investigated to test a given theory.⁴ But this example will nicely illustrate several
implications of the previous discussion.

We consider a landscape potential in which the parameters $\Lambda$ and Q vary so the theory
(I, Ψ) will predict probabilities for their values. Specifically, we assume a landscape of
different one-dimensional valleys with potentials of the form⁵


\[ V_K(\phi_K) = \Lambda_K + \tfrac{1}{2} m_K^2 \phi_K^2, \qquad K = 1, 2, \ldots \qquad (9.21) \]


A valley in this landscape is therefore specified by the values $(\Lambda, m)$. (From now on we
will drop the subscripts K to simplify the notation.) 

The NBWF predicts an ensemble of inflationary universes in this landscape. The
observed classical history of our universe rolled down one of the valleys, “our valley”.
The values of $\Lambda$ and m that characterize our valley can be determined by observation. Measurement
of the expansion history of the universe determines $\Lambda$, and CMB measurements
determine m, because the amplitude of the primordial temperature fluctuations is given by
(Hartle et al. [8]; Hartle and Hertog [15])


\[ Q \propto N_* m. \qquad (9.22) \]


Here $N_* \sim O(60)$ is the number of e-folds before the end of inflation that the COBE scale
left the horizon during inflation. Observations indicate that $Q \sim 10^{-5}$.

4 For a recent discussion of observables associated with features of the CMB fluctuations see e.g. Hartle and
Hertog [15].
5 For examples of more general landscapes see Hertog [28].


The first person probability that we observe specific values of $\Lambda$ and Q is the first person
probability that our past history rolled down the valley which has those values. Since each
of the valleys Eq. (9.21) has a regime of eternal inflation for $\phi > \phi_{ei} \sim 1/\sqrt{m}$, the first
person probabilities Eq. (9.20) that we observe values of $\Lambda$ and Q can be written as


\[ p^{(1p)}(\Lambda, Q) \approx p(\Lambda, Q) \propto \exp[3\pi/(\Lambda + cQ/N_*)], \qquad (9.23) \]


where $c \sim O(1)$. For fixed $\Lambda$ this distribution favors small Q. As mentioned above, we are
interested in the joint probability $p^{(1p)}(\Lambda, Q, D_{loc}) \approx p(\Lambda, Q, D_{loc})$ (first person equals third person in a
regime of eternal inflation). The latter can be rewritten⁶


\[ p(\Lambda, Q, D_{loc}) = p(D_{loc} \mid \Lambda, Q)\, p(\Lambda, Q). \qquad (9.24) \]


The second factor in the expression above is given by Eq. (9.23). The first will be proportional
to the number of habitable galaxies that have formed in our Hubble volume by the
present age $t_0(\Lambda, D_{loc})$. This is plausibly proportional to the fraction of baryons in the form
of galaxies by the present age $t_0$, assuming that is bigger than the collapse time to form a
galaxy. Denoting this by $f(\Lambda, Q, t_0)$ we have from Eq. (9.23)


\[ p^{(1p)}(\Lambda, Q, D_{loc}) \approx p(\Lambda, Q, D_{loc}) \propto f(\Lambda, Q, t_0) \exp[3\pi/(\Lambda + cQ/N_*)]. \qquad (9.25) \]
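The argument of the exponential in Eqs. (9.23) and (9.25) can be traced back to Eq. (9.20) in a few steps. The following is a sketch rather than the authors' own derivation: it assumes that Eq. (9.22) is read as a linear relation $m \sim Q/N_*$, and it absorbs every factor of order one into the constant c.

\[
\begin{aligned}
V(\phi_{ei}) &= \Lambda + \tfrac{1}{2} m^2 \phi_{ei}^2
  \sim \Lambda + \tfrac{1}{2} m^2 \cdot \frac{1}{m}
  = \Lambda + \tfrac{1}{2} m
  && \text{using } \phi_{ei} \sim 1/\sqrt{m}, \\
V(\phi_{ei}) &\sim \Lambda + c\,Q/N_*, \qquad c \sim O(1)
  && \text{using } m \sim Q/N_*, \\
p(\Lambda, Q) &\propto \exp[3\pi / V(\phi_{ei})]
  \sim \exp[3\pi/(\Lambda + c\,Q/N_*)]
  && \text{using Eq. (9.20)}.
\end{aligned}
\]

Because Q sits in the denominator of the exponent, a smaller Q gives a larger weight at fixed $\Lambda$, which is the sense in which the distribution favors small Q.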


All the quantities in Eq. (9.25) are determinable in one Hubble volume — ours. The rest of 
the calculation can therefore be carried out in that volume. We do not need to ask in which 
of the vast number of Hubble volumes in the third person eternally inflating universe we 
dwell. In this model they are all statistically the same as far as the quantities in Eq. (9.25) 
go. We are effectively in the situation of the yellow-green box model in Section 9.6. We 
can coarse grain over all other Hubble volumes but ours. In this way we are able to make 
contact with — and use — calculations in traditional anthropic reasoning. 

Figure 9.4 is a contour plot of f at $t \sim 11$ Gyr, earlier than the present age of 14
Gyr because galaxies have been around for a while. This was adapted from the detailed
astrophysical calculations of Tegmark et al. [19]. For values of $(\Lambda, Q)$ to the right of the
diagonal dotted line the universe accelerates too quickly for pre-galactic halos to collapse.
Fluctuations have to be large enough to collapse into galaxies a bit before $t_0$ and produce
$D_{loc}$. That determines the bottom boundary. If the fluctuations are too large (top boundary)
the bound systems are mostly large black holes inconsistent with $D_{loc}$. The central region
where f > 0.6 is the anthropically allowed region.

The cosmological constant $\Lambda$ is negligible compared to Q in the anthropically allowed
(white) region of Figure 9.4. The exponential dependence $\exp[3\pi N_*/(cQ)]$ implied by the
NBWF means that the probabilities Eq. (9.25) are sharply confined to the smallest allowed
values consistent with galaxies by $t_0$, i.e. $Q \sim 10^{-5}$. The resulting marginal distribution
for $\Lambda$ is shown in Figure 9.5 and is peaked about $\Lambda \sim 10^{-123}$, close to the observed value.
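How sharply the exponential confines Q can be seen by comparing exponents rather than probabilities, since the probabilities themselves overflow any floating-point range. The sketch below is a rough illustration only: it takes c = 1 and $N_* = 60$ as assumed values, and the function name `log_weight` is hypothetical.

```python
import math

N_STAR = 60.0  # number of e-folds, O(60) as in the text
C = 1.0        # the O(1) constant c of Eq. (9.23); the value 1 is an assumption

def log_weight(lam: float, q: float) -> float:
    """Exponent 3*pi / (Lambda + c*Q/N_*) of Eq. (9.23), in Planck units."""
    return 3.0 * math.pi / (lam + C * q / N_STAR)

lam = 1e-123  # observed order of magnitude of the cosmological constant

# Difference of the exponents at Q = 1e-5 and at Q = 2e-5.
delta = log_weight(lam, 1e-5) - log_weight(lam, 2e-5)
print(f"ln[p(Q=1e-5) / p(Q=2e-5)] is about {delta:.3e}")
```

With these assumed values the difference of exponents is of order $10^7$, so merely doubling Q suppresses the NBWF weight by a factor of roughly $\exp(-10^7)$. The value of $\Lambda$ makes no numerical difference here, in line with the statement that it is negligible in the allowed region.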


6 In traditional anthropic reasoning the first factor in Eq. (9.24) is called the “selection probability” which can be
calculated; the second is the “prior” which is assumed. See, e.g. Eq. (1) in Tegmark et al. [19]. The “prior” is
typically assumed to be uniform in the anthropically allowed range reflecting ignorance of part of the theory.
Here both factors follow from the same theory (I, Ψ).


Figure 9.4 A contour plot of the fraction $f(\Lambda, Q, D_{loc})$ of baryons in the form of galaxies by the
time $t \sim 11$ Gyr, adapted from the calculation of Tegmark et al. [19]. As discussed in the text, the
fraction is negligible in the dark (gray) region either because gravitationally bound systems do not
collapse or because they collapse to black holes. The ⋆ locates the observed values of Q and $\Lambda$.


However, the agreement with observation is not the most important conclusion. The 
model is still too simplified for that. What is important is how the example illustrates 
the previous discussion. Specifically, how anthropic reasoning emerges automatically in 
quantum cosmology, how it can be sharpened by a theory of the universe’s quantum state,
how first person probabilities select for large eternally inflating universes, and how what is 
most probable to be observed is not necessarily what is most probable to occur — the return 
of the observer. 


Figure 9.5 The marginal distribution for the cosmological constant $\Lambda$ obtained by integrating Eq.
(9.25) over Q.


9.9 Conclusion


Fundamental physical theories have generally been organized into a third person theory
of what occurs and a first person theory of what we observe. In quantum mechanics the
outputs of the two parts are third person probabilities and first person probabilities. The
character of these two parts, their relative importance, and the relation between them have
changed over time as new data on new scales of observation required new theory and as
the understanding of our observational situation in the universe evolved. The history of
the transitions from classical physics to Copenhagen quantum mechanics, and then to the
quantum mechanics of closed systems that was briefly sketched in the Introduction is part
of this evolution.

As we discussed in this chapter, modern cosmology implies new requirements on both
the third person theory of what occurs in the universe and the first person theory of what
we observe of the universe.

A third person theory of the whole universe requires a quantum mechanics of closed 
systems including quantum spacetime as in Hartle [4] for example. It requires a quan- 
tum mechanics that predicts probabilities for what happens in the early universe when 
there were no observers around and no measurements being made. It requires a quantum 
mechanics that predicts probabilities for the emergence of classical spacetime in a quantum 
theory of gravity. It requires a quantum mechanics that can explain the origin of the rest of 
classical predictability in distant realms of the universe that we will never visit. 

Quantum cosmology also has implications for first person theory. As observers, individ- 
ually and collectively, we are physical systems within the universe with only a probability 
to have evolved in any one Hubble volume and a probability to be replicated in many if 
the universe is very large. What we have seen in this chapter is that this implies that the 
observer returns to importance in at least the following ways: 


• By generally showing that what is most probable to be observed is not necessarily the
most probable to occur.

• By favoring larger universes over small ones if we are rare or if we are common and thus
favoring observations of a significant amount of inflation.

• By making anthropic selection automatic. We will not observe what is where we cannot
exist.

• By requiring the addition of an assumption of typicality to the theory of what occurs that
is made concrete by the xerographic distribution, an addition that can be tested and used
to improve prediction.

• By leading to an understanding of how to compute probabilities for our observations of
fundamental constants in a landscape that allows them to vary.


This list is a brief summary of the results of this chapter, but these lead naturally to a 
number of questions that we discuss here. 

A natural question is why the observer was not important in classical cosmology when 
it is in quantum cosmology. An answer can be traced to differences in starting points and 
objectives. The starting point for classical cosmology was the assumption that the universe 
had a single spacetime geometry. The goal was to infer the geometry of that spacetime from 
large scale observation. Is it approximately homogeneous and isotropic on large scales, 
open or closed ...? What are the values of the cosmological parameters that characterize it 
— the cosmological constant, the Hubble constant, the amount of radiation, the amount of 
baryons, the amplitude of the density fluctuations, etc.? Is there evidence that the spacetime 
had an early period of inflation? Observers were presumed to exist but had a negligible 
influence on the answers to these questions. 

Quantum cosmology does not start by assuming classical spacetime. Rather it starts from 
a theory of the universe’s quantum state and dynamics. From that it seeks to explain when 
spacetime is classical and predict probabilities for what different possible classical space- 
times there are — questions that cannot even be asked in the context of classical cosmology. 
It therefore answers the classical questions with probabilities about large scale geometry. 

A second natural question is, why do fundamental physical theories encompass both 
a third person theory of what occurs and a first person theory of what is observed? The 
historical success of theories organized in this way is indisputable. That success tells us 
something about the world. It supports the idea of some form of realism, perhaps along the 
lines of what Putnam called ‘realism with a human face’ [31]. To paraphrase J. A. Wheeler:
“In a quantum world, the universe is a grand synthesis, putting itself together all the time 
as a whole. Its history is ... a totality which includes us and in which what happens now 
gives reality to what happened then” [30]. 

A third natural question concerns the scale on which we have to know something of 
the universe to make first person predictions. An IGUS⁷ like us is confined to a very local
region inside of a vast Hubble volume which itself is typically but a small part of a much 
larger universe. Yet the present formulation of the first person theory requires information 
beyond our Hubble volume to determine whether we are rare, or common, or other. It 
remains to be seen whether a more local computation of probabilities for observation in 
quantum cosmology can be found. That would be a further way the observer and universe 
are unified. 


7 Information gathering and utilizing system.


Everett’s insight was that, as observers of the universe, we are physical systems within
it, not outside it. We are subject to the laws of quantum mechanics but play no special role
in its formulation. We are negligible perturbations on a third person description of the universe.
But, as shown in this chapter in several different ways, we return to importance for
calculating first person probabilities for our observations precisely because we are physical
systems in the universe. We may have only begun to appreciate the ramifications of that
insight.


Appendix 9.1: Typicality - The Xerographic Distribution


Consider for a moment a third person theory — classical or quantum — that predicts a large 
number of Hubble volumes, some with one kind of observable property, some with another. 
A box model like those of Section 9.4 with one history and different colored volumes would 
be a very simple example. 

Suppose that the data $D_0$ that describe our observational situation (including us) occur
in many different Hubble volumes. One of them is ours, but in other volumes the result
of the observation specified by $D_0$ could be different. A third person theory (I, Ψ) would
predict what is observed in all of the instances of $D_0$. But it does not predict which one is
ours. Indeed, it has no notion of “we” or “us”.

We do not know which of these copies of $D_0$ are us. It could be any one of them. To
make first person predictions for our observations, the third person theory must therefore
be augmented by an assumption about which instance of $D_0$ is us, or more generally with
a probability distribution on the set of copies. If there is no such assumption there are no
predictions. Put differently, predictions for our observations require a statement on what
exactly we mean by “us”, that is, a specification of how we think our particular situation relates
to other instances of $D_0$ in the universe.

In the body of this chapter we have consistently assumed that we are equally likely to be
any of the incidences of our data $D_0$ that the third person theory (I, Ψ) predicts. This is the
simplest and least informative assumption but it is not the only possible one. Other, more
informative assumptions may lead to better agreement between theory and observation, be
more justified by fact, and be more predictive.⁸

We call a distribution that gives the probability that we are any of the incidences of $D_0$ or
a subset of it a xerographic distribution [11]. It is usually written $\xi_i$ where the index i ranges
over all incidences of $D_0$. A xerographic distribution is effectively a formal expression of
the assumption about how typical we are in the universe in the set of all other incidences
of $D_0$.

It is the theoretical structure consisting of (I, Ψ) and $\xi$ that yields first person predictions
for observation. Each of the elements in this combination is therefore testable by
experiment and observation. Just as we can compete different (I, Ψ) by their predictions
for observation we can also compete different $\xi$.
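A toy calculation may make the role of $\xi$ concrete. The sketch below is a minimal illustration for a single history of a box-model kind: the function name `first_person_probs`, the outcome labels, and both example distributions are hypothetical, introduced only for this example. Given the outcome observed in each instance of $D_0$ and a xerographic distribution over those instances, the first person probability of an outcome is the total weight of the instances that see it.

```python
from collections import defaultdict

def first_person_probs(outcomes, xi):
    """First person probability of each outcome.

    outcomes[i] is what is observed in instance i of D_0, and xi[i] is the
    assumed probability that we are instance i.
    """
    if len(outcomes) != len(xi) or abs(sum(xi) - 1.0) > 1e-12:
        raise ValueError("xi must be a normalized distribution over the instances")
    probs = defaultdict(float)
    for outcome, weight in zip(outcomes, xi):
        probs[outcome] += weight  # add the weight of every instance seeing this outcome
    return dict(probs)

# Four instances of D_0 in one history; three observe "red", one observes "blue".
outcomes = ["red", "red", "red", "blue"]

uniform = [0.25, 0.25, 0.25, 0.25]  # typicality: equally likely to be any instance
biased = [0.1, 0.1, 0.1, 0.7]       # a more informative, non-uniform assumption

print(first_person_probs(outcomes, uniform))
print(first_person_probs(outcomes, biased))
```

The same third person facts yield different first person predictions under the two choices of $\xi$, which is why different xerographic distributions can be compared against observation.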


Appendix 9.2: The No-Boundary Wave Function in Homogeneous and Isotropic 
Minisuperspace 


The example in Section 9.7 assumed the no-boundary wave function (NBWF) for the quantum
state Ψ in the third person theory (I, Ψ). The NBWF is the analog of the ground
state for closed cosmologies and therefore a natural candidate for the wave function of
our universe. Its predictions for observations are in good agreement with the results of
actual observations. For example, it predicts that fluctuations away from homogeneity and
isotropy start out near the big bang in their ground state [33]. Combined with its prediction
that our universe underwent an early period of inflation [5], this leads to good agreement
with the observed fluctuations in the CMB. This Appendix gives the reader a little more
explanation of what that wave function is and how its consequences used in the examples
are derived. For more detail see Hartle and Hawking [10].

A quantum state of the universe like the NBWF is represented by a wave function on 
a configuration space of three geometries and matter field configurations on a spacelike 
surface $\Sigma$. On the minisuperspace of homogeneous and isotropic geometries, Eq. (9.12),
and a single homogeneous scalar field we write 


\[ \Psi = \Psi(b, \chi). \qquad (9.26) \]


Here b is the scale factor of the homogeneous, isotropic metric Eq. (9.12) on $\Sigma$ and $\chi$ is
the value of the homogeneous scalar field. The no-boundary wave function is a particular
wave function of this form.

8 As for the Boltzmann brain problem [12]. Boltzmann brains are not a problem if we assume that our
observations are not typical of deluded observers who only imagine that they have the data $D_0$.


The NBWF is formally defined by a sum over homogeneous and isotropic Euclidean
geometries that are regular on a topological four-disk and match b and $\chi$ on its boundary
[10], weighted by $\exp(-I)$ where I is the Euclidean action of the configurations. We will
not need to go into this construction. We will only need its leading order in $\hbar$ semiclassical
approximation. That is given by the saddle points (extrema) of the action $I[a(\tau), \phi(\tau)]$ on
this disk for metric coupled to scalar field. There is one dominant saddle point for each
$(b, \chi)$. Generally, these saddle points will have complex values of both metric and field.
The semiclassical NBWF is a sum over saddle point terms of the form (in units where
$\hbar = 1$)

\[ \Psi(b, \chi) \propto \exp[-I(b, \chi)/\hbar] \propto \exp[-I_R(b, \chi) + iS(b, \chi)] \qquad (9.27) \]


where $I(b, \chi)$ is the action evaluated at the saddle point and $I_R$ and $-S$ are its real and
imaginary parts.

The wave function Eq. (9.27) has a standard WKB semiclassical form. As in non-relativistic
quantum mechanics, an ensemble of classical histories is predicted in regions
of configuration space where S varies rapidly in comparison with $I_R$. The histories are the
integral curves of S defined by solving the Hamilton–Jacobi relations relating the momenta
$p_b$ and $p_\chi$ conjugate to b and $\chi$ to the gradients of S


\[ p_b = \nabla_b S, \qquad p_\chi = \nabla_\chi S. \qquad (9.28) \]


There is a one-parameter family of classical histories, one for each saddle point. It is
convenient to label them by the magnitude of the scalar field $\phi_0$ at the center of the saddle
point. One can think crudely of $\phi_0$ as the value of the scalar field at which it starts to roll
down to the bottom of the potential in a classical history.

The third person probabilities for these histories are proportional to the absolute square 
of the wave function Eq. (9.27) 


\[ p(\phi_0) \propto \exp[-2 I_R(\phi_0)] \approx \exp[3\pi / V(\phi_0)]. \qquad (9.29) \]


The last term is a crude approximation to the action which is useful in rough estimates 
[18]. 
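The first step in Eq. (9.29) is the usual statement that the phase of a WKB wave function drops out of its modulus. As a short check, which is not spelled out in the source and assumes that $I_R$ and S are both real:

\[
|\Psi(b, \chi)|^2 \propto \left| \exp[-I_R + iS] \right|^2
  = \exp[-2 I_R] \, \left| e^{iS} \right|^2
  = \exp[-2 I_R].
\]

Read against the exponential used in Eqs. (9.19) and (9.20), the crude approximation corresponds to $I_R(\phi_0) \approx -3\pi/[2V(\phi_0)]$, so histories that start at lower values of the potential carry exponentially larger third person probability.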

This is a very quick summary of a lot of work. For more details see Hartle et al. [6]. 
In short, in the leading semiclassical approximation the NBWF predicts an ensemble of 
possible classical spacetimes obeying Eq. (9.28) with third person probabilities (9.13).


10
Testing Inflation 


CHRIS SMEENK 


10.1 Introduction 


Over the last 30 years, inflationary cosmology has been the dominant theoretical frame- 
work for the study of the very early universe. The central idea of inflation is that the 
universe passed through an impressive growth spurt, a transient phase of quasi-exponential 
(“inflationary”) expansion which sets the stage for subsequent evolution described by the 
standard big bang model of cosmology. This inflationary phase leaves an imprint on vari- 
ous observable features of the universe. Observations can then constrain the fundamental 
physics driving inflation, typically described in terms of an “inflaton” field. Traces of an 
inflationary stage left in the form of temperature variations and polarization of the cosmic 
microwave background radiation (CMB) are particularly revealing regarding the inflation- 
ary phase. There are currently many models of inflation compatible with the available data, 
including the precise data sets generated by assiduous observations of the CMB. Yet there 
are ongoing debates regarding how strongly these data support inflation. Critics of inflation 
argue, among other things, that its compatibility with the data reflects little more than the 
enormous flexibility of inflationary model-building. These concerns have become partic- 
ularly pressing in light of the widespread acceptance of eternal inflation, which seems to 
imply that all possible observations are realized somewhere in a vast multiverse. 

Whether inflation can be empirically justified, whether it is “falsifiable”, is a leitmotif
of these debates. There has been little agreement among cosmologists about how to define
falsifiability, and whether it demarcates science from the rest as Popper intended.¹ The
question at issue is how to characterize a theory’s empirical success, and to what extent success,
so characterized, justifies accepting a theory’s claims, including those that extend far
beyond its evidential basis. Success defined as merely making correct predictions, merely
“saving the phenomena”, does not provide sufficient justification, for familiar reasons.
False theories can make correct predictions, and predictive success alone is not sufficient to
distinguish among rival theories that happen to agree in domains we have access to. Facile
arguments along these lines do not identify legitimate limits on the scope of scientific
knowledge; instead, they indicate the need for a more careful analysis of how evidence
supports theory. Philosophers have long acknowledged this need, and physicists have historically
demanded much more of their theories than mere predictive success. Below I will
focus on two historical cases exemplifying strong evidential support. The strategies illustrated
in these cases generalize, and inspire an account of how theory and data should be
related for a theory to meet a higher standard of empirical success. A theory that is successful
by this standard arguably makes a stable contribution to our understanding of nature,
in the sense that it will be recovered as a valid approximation within a restricted domain
according to any subsequent theory.


1 The question of inflation’s falsifiability is discussed by several of the contributors to Turok (1997); Barrow and
Liddle (1997) argue that inflation can be falsified. For recent discussions from opposing viewpoints see, for
example, Ellis and Silk (2014) and Carroll (2014). These debates use Sir Karl’s terminology but do not engage
in detail with his views about scientific method.

Both strategies focus on mitigating the risk associated with accepting a theory. In the 
initial stages of inquiry, a theory is often accepted based on its promise for extending our 
epistemic reach. Theories allow us to use relatively accessible data to answer questions 
about some other domain; they provide an epistemic handle on entities or phenomena that 
are otherwise beyond our grasp. Inflationary cosmology allows us to gain access to the 
very early universe, and high energy physics, in just this sense: if inflation occurred, then 
observable features of the CMB reflect the dynamical evolution of the inflaton in the very 
early universe. Using theory to gain access to unobservable phenomena poses an obvious
risk. The theory provides the connections between data and the target phenomena, and the 
data provide relevant evidence when interpreted in light of the theory. How does one avoid 
accepting a just-so story, in the form of an incorrect theory that fits the data? Demanding 
strong evidence at the outset of inquiry would be counter-productive, because the best 
evidence is typically developed through a period of theory-guided exploration. The detailed 
quantitative assessment of a theory is a long-term achievement. The discussion of historical 
cases in Section 10.2 illustrates how a theory can be tightly constrained by independent 
measurements, and subjected to ongoing tests as a research program develops. 

These considerations suggest reformulating debates regarding the falsifiability of inflation
as an assessment of two questions (Section 10.4). To what extent do observations
of the early universe provide multiple, independent constraints on the physics underly- 
ing inflation? And has inflation made it possible to identify new physical features of the 
early universe that can be checked independently? Focusing on these questions allows for a 
clearer assessment of the challenges faced by cosmologists in developing evidence of com- 
parable strength to that in other areas of physics, going beyond compatibility of inflationary 
models with available observations. I will argue that the main challenge to the program of 
reconstructing the inflaton field is a lack of independent lines of evidence. But if inflation 
is generically eternal, I will briefly argue that the challenges are insurmountable: eternality 
undermines the evidence taken to support inflation, and blocks any possibility of making a 
stronger empirical case. 


10.2 The Determination of Theory by Evidence 


Assessment of the degree of evidential support for theories, drawing distinctions among 
theories that all “save the phenomena”, has long been a focus of epistemological dis- 
cussions in physics. On one extreme, some theories are merely compatible with the
phenomena, in that their success may reflect ingenuity and flexibility rather than accuracy.
Although models constructed by fitting the data can be useful for a variety of purposes, 
they are not regarded as revealing regularities that can be reliably projected to other cases. 
On the other extreme, the new laws and fundamental quantities introduced by a theory are 
as tightly tied down by the evidence as Gulliver by the Lilliputians’ ropes. Even though
such theories make claims about the structure of the natural world that go far beyond the 
data used to support them, physicists take them to accurately capture the relevant quantities 
and law-like relations among them, which can then be projected to other cases and used as 
the starting point for further work. In numerous historical cases physicists regard a given 
body of evidence as strong enough to determine the correct theory.²

Judgments of the strength of evidence are as difficult to analyze as they are central to 
the practice of physics. To borrow an analogy from Howard Stein, the situation is akin 
to that in nineteenth century mathematics: the notion of an adequate mathematical proof, 
despite its significance to mathematical practice, had not yet been given a systematic treat- 
ment. Successful mathematical reasoning did not await the development of classical logic, 
however, just as the evaluation of physical theories proceeds without a canonical inductive 
logic. Below I will highlight two styles of argument that have been employed effectively 
in the history of physics to establish that theories have strong evidential support. Although 
I will not attempt to analyze these in detail, I hold that any proposed systematic account of 
inductive reasoning should be judged in part by its treatment of cases like these. 


One style of argument exploits a theory’s unification of diverse phenomena, exemplified 
by Perrin’s famous case for atomism. Perrin argued for the existence of atoms based on 
agreement among 13 different ways of determining Avogadro’s number N, drawing on phe- 
nomena ranging from Brownian motion to the sky’s color. This case is particularly striking 
due to the diversity of phenomena used to constrain the value of N, and also to the ease of 
comparison of different results, all characterized in terms of the numerical value of a single 
parameter. This argument was only possible due to refinements of the atomic hypothesis, 
and extensions of statistical mechanics, that allowed precise formulations of relationships 
between the physical properties of atoms or molecules and measurable quantities. Perrin 
focused on N (the number of atoms or molecules in a mole of a given substance) in partic- 
ular as a useful invariant quantity, and measured N in a series of experiments on Brownian 
motion, drawing on theoretical advances due to Einstein and others. (See Nye (1972) for a 
historical study of Perrin’s work.) By roughly 1912, Perrin’s arguments had succeeded in 
convincing the scientific community of the reality of atoms, decisively settling what had 
previously been regarded as an inherently intractable, “metaphysical” question. 

2 My approach to these issues is indebted to the work of several philosophers, in particular Harper (2012), Norton
(1993) (from whom I’ve borrowed the title of this section), Norton (2000), Smith (2014), and Stein (1994).


This kind of overdetermination argument has been used repeatedly in the history of
physics (see, in particular, Norton, 2000). One common skeptical line of thought holds
that theories are inherently precarious because they introduce new entities, such as atoms,
in order to unify phenomena. Success fitting a body of data, the skeptic contends, merely
reflects the flexibility of these novel theoretical constructs. The consistent determination 
of theoretical parameters from diverse phenomena counters the worry that the theory only 
succeeds due to a judicious tuning of free parameters. The overdetermination argument 
shows that, rather than the piecemeal success the skeptic expects, the theory succeeds with 
a single choice of parameters. The strength of this reply to the skeptic depends on the 
extent to which the phenomena probe the underlying theoretical assumptions in distinct 
ways. Furthermore, the diversity of phenomena minimizes the impact of systematic errors 
in the measurement of the parameters. The sources of systematic error relevant to Per- 
rin’s study of Brownian motion have little to do with those related to measurements of N 
based on radioactivity, for example. As the number of independent methods increases, the 
probability that the striking agreement can be attributed to systematic errors decreases. 
The conclusion to be drawn from the overdetermination argument depends upon how 
unlikely the agreement is antecedently taken to be. The truth of the atomic hypothesis 
and kinetic theory implies an equation relating N to a number of quantities measurable by 
experimental study of Brownian motion, a second equation relating N to radioactivity, and 
so on. If the atomic hypothesis were false, there is no reason to expect these combinations 
of measurable quantities from different domains to all yield the same numerical value, 
within experimental error. This claim reflects an assessment of competing theories: what is 
the probability of a numerical agreement of this sort, granting the truth of a competing the- 
ory regarding the constitution of matter? The overdetermination argument has little impact 
if there is a competing theory which predicts the same numerical agreements. In Perrin’s 
case, by contrast, the probability assigned to the agreeing measurements of N, were the 
atomic hypothesis to be false, is arguably very low. In arguing for a low antecedent prob- 
ability of agreement, Perrin emphasized the independence and diversity of the phenomena 
used to determine the value of N. (Obviously this brief account highlights only one aspect 
of Perrin’s argument; see Chalmers (2011); Psillos (2011) for more thorough treatments.) 
There is a second respect in which the conclusions to be drawn from an overdetermina- 
tion argument must be qualified. These arguments typically bear on only part of a theory, 
namely whatever is needed to derive the connections between theoretical parameters and 
measurable quantities. Perrin’s case is unusual in that the evidence bears directly on the 
central question in the dispute regarding atomism, unlike other historical cases in which 
this style of argument was not as decisive. The strength of an overdetermination argu- 
ment depends on whether there is sufficient evidence to constrain all of a theory’s novel 
components, or at least the ones at issue in a particular debate. The argument only directly 
supports parts of the theory needed to establish connections between measurable quantities 
and theoretical parameters. Identifying the distinct components of a theory and clarifying 
their contribution to its empirical success is often quite challenging, as the acceptance of 
the aether based on success of electromagnetic theory in the nineteenth century illustrates. 
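
The earlier remark that the probability of accidental agreement falls as independent methods are added can be illustrated with a toy calculation. This is a minimal sketch under a strong and hypothetical assumption: that, were the theory false, each additional method would agree with the first with some fixed probability `p_single`, independently of the others. The function name and the value of `p_single` are illustrative choices, not historical estimates.

```python
def chance_of_spurious_agreement(p_single: float, n_methods: int) -> float:
    """Toy probability that n independent methods agree purely by accident.

    Assumes (hypothetically) that, if the theory were false, each additional
    method would land inside the agreement window of the first with
    probability p_single, independently of the others.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability")
    if n_methods < 1:
        raise ValueError("need at least one method")
    # The first method fixes the value; each of the remaining n-1 must match it.
    return p_single ** (n_methods - 1)


if __name__ == "__main__":
    # p_single = 0.1 is an arbitrary illustrative choice.
    for n in (2, 5, 13):
        print(n, chance_of_spurious_agreement(0.1, n))
```

The independence assumption carries all the weight here. If the methods share theoretical assumptions or sources of systematic error, the product overstates how unlikely the agreement is, which is why the diversity of the phenomena matters to the argument.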


The second style of argument focuses on evidence that accumulates over time as a the- 
ory supports ongoing inquiry. A physical theory introduces a set of fundamental quantities 
and laws holding among them that provide the means for explicating some domain of
phenomena. Accepting a theory implies a commitment to account for phenomena within 
this domain on the theory’s terms, under the pressure of new discoveries and improving 
standards of experimental and observational precision. Often this involves treating com- 
plex phenomena via a series of successive approximations, with further refinements driven 
by discrepancies between current theoretical descriptions and observations. Resolving dis- 
crepancies by adding further details, without abandoning basic commitments, provides 
evidence that the theory accurately captures the fundamental physical relationships. The 
evidence is particularly strong when this process uncovers new features of the system that 
can be independently confirmed. 

Newtonian gravitational theory supported centuries of research in celestial mechanics 
in just this sense. With the benefit of gravitational theory, one could approach enormously 
complicated orbital motions, such as that of the Moon, via a series of idealizations that 
incorporate physical details thought to be relevant. Throughout the history of celestial 
mechanics, there have nearly always been systematic discrepancies between observations 
and trajectories calculated based on all the relevant details known at a given time. Subsequent
efforts then focused on identifying details left out of the calculation that might
resolve the discrepancy. Leverrier’s inference that an undiscovered planet was the source 
of discrepancies in Uranus’s orbit is perhaps the most famous example of this type of rea- 
soning. But in most cases the physical source that was eventually identified was not as 
concrete as an additional planet; the secular acceleration of the Moon, for example, results 
from the slowing rotation of the Earth due to tidal friction. The new details are then incor- 
porated in a more elaborate model, and the search for discrepancies continues. By the early 
twentieth century, calculations of orbital motion included an enormous number of details. 
The theory was sufficiently precise to reveal very subtle discrepancies, such as systematic 
errors in determining sidereal time due to a periodic fluctuation in the Earth’s rotational 
speed. 

Smith (2014) convincingly argues that the success of this line of inquiry provides much 
stronger support for Newtonian gravity than is apparent if the theory is simply treated as 
making a series of successful predictions. Theoretical models of celestial motions had to 
be in place to even identify the small discrepancies that were the target of analysis, and 
in that sense the theory itself underwrites its detailed comparison with observation. The 
core commitments of the theory place stringent constraints on the kind of new physical 
details that can be introduced to account for discrepancies. Furthermore, these additions 
could usually be checked using methods that did not depend on gravity — as with the dis- 
covery of Neptune in the location predicted by Leverrier, or measurements of the periodic 
fluctuation in the Earth’s rotation, initially detected by astronomers, using atomic clocks. 
These independent checks on the details incorporated in ever more elaborate models sup- 
port the theory’s claim to have accurately identified the appropriate quantities and laws. It 
would be an enormous coincidence for a fundamentally incorrect theory to be so useful
in discovering new features of the solar system. 
These two historical cases illustrate strategies, arguably used throughout physics, to provide
an effective response to skepticism regarding theoretical knowledge. This skepticism
is inspired by the apparent circularity of relying so heavily on the very theory in question to 
support detailed comparison with observations. When interpreted with the aid of the theory, 
the phenomena yield constraints on the parameters of the theory, and discrepancies that can 
be the target of further work; yet neither the constraints nor the discrepancies are readily 
available without the guidance provided by theory. How do we guard against accepting 
a theory that is self-certifying, sufficiently flexible to avoid refutation despite its flaws? 
In both examples above, the response to this worry relies upon using multiple, indepen- 
dent lines of evidence, while acknowledging the theory-dependence of the reasoning. The 
prosaic point underlying this response is that using multiple sources of information, depen- 
dent on theory in different ways and with different sources of systematic error, minimizes 
epistemic risk. This response shifts the burden of proof onto the skeptic: if the underlying 
theory were false, it would be an enormous coincidence for all of the multiple ways of 
measuring a parameter to coincide, or for new features added to resolve discrepancies to 
be independently confirmed. 

A second skeptical objection regards the nature of the claims supported by these argu- 
ments: can they be regarded as stable contributions to our knowledge of the natural world? 
Perrin made the case for atomism prior to the advent of quantum theory, and the reasoning 
in celestial mechanics described above precedes Einstein. Are these arguments undermined 
by quantum mechanics and general relativity, respectively? As a brief reply to this Kuh- 
nian worry, consider the nature of the claims that are supported by the arguments above. 
These are claims that specific law-like relations hold between different physical quantities 
within some domain. Perrin’s case depends upon the relations between atomic scale prop- 
erties and macroscopic, measurable properties. The development of celestial mechanics 
supports a variety of claims about what features of the solar system are relevant to plan- 
etary motions. In these two cases, the claims in question are arguably preserved through 
theory change, in the sense that there are lawlike relations in the successor theory which 
are approximated, within a restricted domain, by corresponding relations in the preceding
theory.³ This is true in spite of the dramatic conceptual differences between classical
mechanics and quantum mechanics, and between Newtonian gravity and general relativity. 
As a consequence, the reasoning employed in arguing for the atomic hypothesis and in the 


3 Determining how to recover the preceding theory in an appropriate limit is often surprisingly subtle, and I do not 
have the space to explore the issue fully here. In the case of Perrin’s argument, for example, the central assump- 
tions of kinetic theory Perrin used in his study of Brownian motion are good approximations to a quantum 
statistical mechanical treatment, except in cases of low temperatures or high densities; a full discussion would 
consider the approximations involved in the other methods for measuring N as well. For celestial mechanics, 
the claims in question regard the impact of, for example, Neptune on Uranus’s orbit. The current practice of 
modeling the solar system using Newtonian physics with general relativistic corrections presupposes that the 
Newtonian description is a valid approximation. Finally, this claim regarding continuity does not require that 
the earlier theory provides a full account of the phenomena. Perrin did not have a complete theory of the nature 
of molecules, for example; he was well aware that the problem of specific heats identified by Maxwell had not 
been solved. 
development of celestial mechanics is validated rather than undermined by the successor
theories. 

There may be cases in which a new theory recovers only the predictions of an earlier 
theory, without establishing the validity of its evidential reasoning in this stronger sense. 
I do not intend to rule out that possibility by fiat; the evidence may simply unravel. But 
the burden of proof rests with the new theory to explain away the apparent successes of 
an old theory when the latter is not recovered as an approximation. In the cases described 
above, no one has imagined a credible alternative theory that matches the successes of the 
atomic hypothesis or Newtonian gravity without recovering aspects of these theories as 
limiting cases. This is the qualified sense in which theories supported by arguments like 
those described above constitute a stable contribution to our understanding of nature. 


10.3 The Standard Model and Inflation 


The standard model of cosmology (SMC) is based on bold extrapolations of theories that 
have been well tested by Earth-bound experiments and astronomical observations. The 
interpretation of cosmological data depends, to varying degrees, on a background cos- 
mological model, and hence assumes the validity of extrapolating general relativity to 
length scales roughly 14 orders of magnitude greater than those where the theory is sub- 
ject to high precision tests. The SMC describes the contents of the universe and their 
evolution based on the Standard Model of particle physics, supplemented with two dis- 
tinctive types of matter — dark matter and dark energy — that have so far only been detected 
due to their large-scale gravitational effects. Cosmological observations performed over 
the last few decades substantiate the enormous extrapolations and novel assumptions of 
the SMC. 

The development of a precise cosmological model compatible with the rich set of cos- 
mological data currently available is an impressive achievement. Yet cosmology clearly 
relies very heavily on theory; the cosmological parameters that have been the target of 
observational campaigns are only defined within a background cosmological model. The 
SMC includes several free parameters, such as the density parameters characterizing the 
abundance of different types of matter, each of which can be measured by a variety of dif- 
ferent types of observations. CMB observations, in particular, place powerful constraints 
on many cosmological parameters. (Inferences to parameter values from observations of 
the CMB typically require prior assumptions regarding the nature of the primordial power 
spectrum, and there are several parameter degeneracies that cannot be resolved based solely 
on CMB observations.) There is a variety of independent ways of measuring the cosmo- 
logical parameters that depend on different aspects of theory and have different sources of 


4 See Beringer et al. (2012) for a review of evidence bearing on the cosmological parameters. The total number of 
parameters used to specify a cosmological model varies in different studies, but typically five to ten fundamental 
parameters are used to determine the best fit to a given data set. (Specific models often require a variety of 
further “nuisance parameters” to account for astrophysical processes.) 
observational error. For example, the abundance of deuterium produced during big bang
nucleosynthesis depends sensitively on the baryon density. Nucleosynthesis is described 
using well-tested nuclear physics, and the light element abundances are frozen in within 
the “first three minutes”. The amplitudes of the acoustic peaks in the CMB angular power 
spectrum can be used to determine the baryon density at a later time (recombination, at 
t ~ 400,000 years), based on quite different theoretical assumptions and observational 
techniques. Current measurements fix the baryon density to an accuracy of 1 per cent, and
the values determined by these two methods agree within observational error. This agree- 
ment (augmented by other consistent measurements) is an important consistency check for 
the SMC. 
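
A consistency check of this kind is often summarized by expressing the difference between two independent determinations in units of their combined uncertainty. The following is a minimal sketch, assuming Gaussian and uncorrelated errors (an idealization). The function name is hypothetical, and the numbers in the usage example are placeholders in arbitrary units, not actual baryon-density measurements.

```python
import math


def tension_in_sigma(x1: float, err1: float, x2: float, err2: float) -> float:
    """Difference between two independent measurements in units of the combined 1-sigma error.

    Assumes Gaussian, uncorrelated uncertainties, so the errors add in quadrature.
    """
    if err1 <= 0 or err2 <= 0:
        raise ValueError("uncertainties must be positive")
    return abs(x1 - x2) / math.hypot(err1, err2)


if __name__ == "__main__":
    # Placeholder values in arbitrary units; NOT real measurements.
    print(tension_in_sigma(1.00, 0.03, 1.04, 0.04))
```

A result well below about 2 would conventionally be read as agreement within observational error, although the threshold is itself a matter of convention.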

The strongest case for accepting the SMC rests on the evidence in favor of the underly- 
ing physics in concert with the overdetermination of cosmological parameters. (See, e.g. 
Peebles et al. (2009), §5.4 for a brief discussion of tests of the SMC emphasizing the 
importance of independent measurements of the parameters.) The overdetermination argu- 
ment has a similar structure to Perrin’s argument for atomism described above. The case 
for the SMC does not yet have the diversity of independent lines of measurement that 
made Perrin’s case so powerful; there are unexplained discrepancies among some measure- 
ments; individual measurements are not as precise as those available in many other areas 
of physics; and there are theoretical loopholes related to each measurement. But the essen- 
tial epistemic point is the same: due to the diversity of measurements, it is unlikely that 
evaluation of the SMC has been entirely misguided due to incorrect theoretical assump- 
tions or systematic observational errors. Several lines of observational and theoretical work 
currently being pursued promise to substantially strengthen the evidential support for the 
SMC. 


Several of the cosmological parameters characterize the universe’s “initial” state.⁵ The
SMC describes the large-scale structure of the universe as a perturbed Friedmann–Lemaître–Robertson–Walker
(FLRW) model. The FLRW models are homogeneous and
isotropic solutions of Einstein’s field equations (EFE). The models are topologically $\Sigma \times \mathbb{R}$,
visualizable as a “stack” of three-dimensional spatial surfaces $\Sigma(t)$ labeled by cosmic time
t. The worldlines of “fundamental observers”, at rest with respect to matter, are orthogonal 
to these surfaces, and the cosmic time corresponds to their proper time. EFE simplify to 
two equations governing R(t), the spatial distance between fundamental observers. 

One cosmological parameter, the spatial curvature $\Omega_k$, characterizes which of the
FLRW models best fits observations. It is constrained by observations to be very close to
zero, corresponding to the “flat” FLRW model whose spatial hypersurfaces $\Sigma(t)$ have zero
curvature. For the flat model, the total energy density takes exactly the value ($\Omega = 1$)


5 This is often taken to be the state as specified at the “boundary of the domain of applicability of classical 
GR”, e.g. at the Planck time, $t \sim 10^{-43}$ s. Appropriate initial data for EFE specify the full solution (for
globally hyperbolic spacetimes), so the choice of a specific cosmic time at which to characterize the initial state 
is a matter of convention. But this conventional choice is significant if a given solution is treated as a perturbed 
FLRW solution, since dynamical evolution modifies the power spectrum of perturbations. 
needed to counteract the initial velocity of expansion, $\dot{R} \to 0$ as $t \to \infty$. $\Omega := \rho/\rho_c$, where
$\rho$ is the mass-energy density and the critical density is defined as $\rho_c = \frac{3}{8\pi}\left(H^2 - \frac{\Lambda}{3}\right)$. $H$ is
the Hubble “constant” (which in fact varies with cosmic time), defined as $H = \dot{R}/R$, and $\Lambda$
is the cosmological constant. Other parameters characterize perturbations to the underly- 
ing FLRW model, which are fluctuations in mass density needed to provide the seeds for 
structure formation via gravitational clumping. If these fluctuations obey Gaussian statistics,
they can be fully characterized in terms of a dimensionless power spectrum $P(k)$. The
power spectrum of the primeval mass distribution in the SMC takes the simple form of a
power law, $P(k) \propto k^{n_s}$. This power spectrum is parametrized in the SMC by two numbers.
The first, the spectral index $n_s$, is equal to unity if there is no preferred scale in the
power spectrum; observations currently favor $n_s = 0.96$, indicating a slight “red tilt” in
the power spectrum, with less power on smaller length scales. A second number is needed
to specify the amplitude of the perturbations. (There are a few different ways of doing so.
For example, $\sigma_8$ is the mass variance of the primordial distribution within a given radius
(defined in terms of another parameter, the distance scale $h$: $8h^{-1} \approx 11$ Mpc, given current
estimates of $h$), projected forward to the current time using linear perturbation theory.)
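
The effect of a spectral index slightly below unity can be made concrete with a short calculation. This is a sketch assuming a pure power law over the range considered. The function name is hypothetical, and the default value 0.96 is the figure quoted in the text.

```python
def relative_power(k_ratio: float, n_s: float = 0.96) -> float:
    """Tilt factor between two wavenumbers for a power-law spectrum.

    For P(k) proportional to k**n_s, the ratio P(k2)/P(k1) differs from the
    scale-invariant (n_s = 1) ratio by the factor (k2/k1)**(n_s - 1).
    k_ratio is k2/k1; larger k corresponds to smaller length scales.
    """
    if k_ratio <= 0:
        raise ValueError("k_ratio must be positive")
    return k_ratio ** (n_s - 1.0)


if __name__ == "__main__":
    # One decade toward smaller length scales (k2 = 10 * k1).
    print(relative_power(10.0))
```

With the quoted index, each decade toward smaller length scales lowers the power by a little under 10 per cent relative to an exactly scale-invariant spectrum.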

The initial state required by the SMC has three particularly puzzling features. First, it 
is surprising that the simple, uniform FLRW models can be used at all in describing the 
early universe. These models have a finite horizon distance, much smaller than the scales 
at which we observe the CMB.⁶ The observed isotropy of the early universe, revealed
most strikingly by the temperature of the CMB, supports the use of the FLRW models;
yet these observations cover thousands of causally disjoint regions. Why did the universe 
start off with such a glorious pre-established harmony? Second, an FLRW model close to 
the “flat” model, with nearly critical density at some specified early time is driven rapidly 
away from critical density under FLRW dynamics; the flat model is an unstable fixed point 
under dynamical evolution. In order for observations at late times to be compatible with a 
flat model, the initial state has to be very close to the flat model (or, equivalently, very close 
to critical density, $\Omega = 1$). (It follows from the FLRW dynamics that $|\Omega - 1|/\Omega \propto R^{3\gamma - 2}(t)$;
$\gamma > 2/3$ if the strong energy condition holds, and in that case an initial value of $\Omega$ not equal
to 1 is driven rapidly away from 1. Observational constraints on $\Omega(t_0)$ can be extrapolated
back to a constraint on the total energy density at the Planck time, namely $|\Omega(t_p) - 1| < 10^{-59}$.)
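
The scaling relation behind the flatness problem can be recovered in a few lines. This is a sketch under assumptions not spelled out in the text: units with $G = c = 1$, the standard Friedmann equation with curvature constant $k$, and an equation of state $p = (\gamma - 1)\rho$, so that energy conservation gives $\rho \propto R^{-3\gamma}$.

```latex
% Assumptions: G = c = 1; k is the curvature constant; p = (\gamma - 1)\rho.
\begin{align*}
  H^2 &= \frac{8\pi}{3}\rho - \frac{k}{R^2} + \frac{\Lambda}{3}, \\
  \rho_c &= \frac{3}{8\pi}\Bigl(H^2 - \frac{\Lambda}{3}\Bigr)
          = \rho - \frac{3}{8\pi}\,\frac{k}{R^2}, \\
  \frac{\Omega - 1}{\Omega} &= \frac{\rho - \rho_c}{\rho}
          = \frac{3k}{8\pi}\,\frac{1}{\rho R^2}, \\
  \rho \propto R^{-3\gamma} \;&\Longrightarrow\;
  \frac{|\Omega - 1|}{\Omega} \propto R^{3\gamma - 2}.
\end{align*}
```

With this equation of state the strong energy condition $\rho + 3p > 0$ is equivalent to $\gamma > 2/3$, so the exponent $3\gamma - 2$ is positive and any departure from $\Omega = 1$ grows as the universe expands.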

Finally, the perturbations to the flat FLRW model postulated in the SMC are challenging 
to explain physically. It is not clear what physical processes could account for the ampli- 
tude of the perturbations. Suppose, for example, that one takes the “initial” perturbation 
spectrum to be imprinted at $t_i \approx 10^{-35}$ s. Observations imply that at this time the initial
perturbations would be far, far smaller than thermal fluctuations. (Blau and Guth (1987) 


6 A horizon is the surface in a time slice $t_0$ separating particles moving along geodesics that could have been
observed from a worldline $\gamma$ by $t_0$ from those which could not. For a radiation-dominated FLRW model,
the expression for horizon distance $d_h$ is finite; the horizon distance at decoupling corresponds to an angular
separation of $\approx 1^\circ$ on the surface of last scattering.
calculate that observations imply a density contrast $\delta\rho/\rho \approx 10^{-49}$ at $t_i$, nine orders of magnitude
smaller than thermal fluctuations.) In addition, the perturbations of the appropriate
scale to eventually form galaxies would, in the early universe, be coherent at scales that
seem to conflict with the causal structure of the FLRW models. A simple scaling argument
shows that the wavelength $\lambda$ of a given perturbation “crosses the horizon” with expansion,
at the time when $\lambda \approx H^{-1}$ (where $H^{-1}$ is the Hubble radius). Assuming that the
perturbation spectrum is scale invariant, and for a simple model with $R(t) \propto t^n$ ($n < 1$),
the wavelength of a given mode simply scales with the expansion, $\lambda \propto t^n$. $H^{-1}$ scales as
$H^{-1} \propto t$; as a result, the Hubble radius “crosses over” perturbation modes with expansion.
Prior to horizon-crossing, the perturbation would have been coherent on length scales
greater than the Hubble radius. The Hubble radius is typically regarded as marking the
limit of causal interactions, and as a result it is puzzling how normal physics operating in
the early universe could produce coherent perturbations at such scales.⁷
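
The scaling argument can be checked numerically. The sketch below assumes a pure power-law expansion, with the scale factor proportional to t**n, and normalizes the ratio of wavelength to Hubble radius to one at an arbitrary crossing time. The function name and the normalization are illustrative choices.

```python
def wavelength_over_hubble_radius(t: float, n: float, t_cross: float = 1.0) -> float:
    """Ratio of a mode's wavelength to the Hubble radius for R(t) ~ t**n.

    The wavelength scales as t**n and the Hubble radius as t, so the ratio
    scales as (t / t_cross)**(n - 1), equal to 1 at horizon crossing.
    """
    if t <= 0 or t_cross <= 0:
        raise ValueError("times must be positive")
    return (t / t_cross) ** (n - 1.0)


if __name__ == "__main__":
    # n = 1/2 (radiation-dominated) and n = 2/3 (matter-dominated).
    for n in (0.5, 2.0 / 3.0):
        for t in (0.01, 1.0, 100.0):
            print(n, t, wavelength_over_hubble_radius(t, n))
```

For any n below one the ratio exceeds unity before the crossing time and falls below it afterwards, which is the sense in which a mode was larger than the Hubble radius at early times.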


Since the late 1970s, cosmologists have sought a physical understanding of how such an 
unusual “initial state” came about. On a more phenomenological approach, the gravita- 
tional degrees of freedom of the initial state could be chosen to fit with later observations. 
Inflation in effect replaces such a specification with a hypothesis regarding the initial con- 
ditions and dynamical evolution of a proposed “inflaton” field (or fields). In the simplest 
inflationary models, a single field $\phi$, trapped in a false vacuum state, triggers a phase
of exponential expansion. If the inflaton field $\phi$ is homogeneous, then the false vacuum
state contributes an effective cosmological constant to EFE, leading to quasi-de Sitter
expansion.⁸

The resulting spurt of inflationary expansion can provide a simple physical account 
of the SMC’s starting point, as emphasized with sufficient clarity to launch a field by 
Guth (1981). Inflationary expansion stretches the horizon length; for N “e-foldings” of 
expansion the horizon distance $d_h$ is multiplied by $e^N$. For $N > 65$ the horizon distance,
while still finite, encompasses the observed universe. The observed universe could then 
have evolved from a single pre-inflationary patch, rather than encompassing an enormous 
number of causally disjoint regions. (This pre-inflationary patch is larger than the Hub- 
ble radius (Vachaspati and Trodden, 2000), however, so inflation does not dispense with 
pre-established harmony.) During an inflationary phase the density parameter $\Omega$ is driven
towards one. (This is apparent from the equation above, given that $\gamma = 0$ during inflation.)
An inflationary stage long enough to solve the horizon problem drives a large range of 


7 The Hubble radius is defined in terms of the instantaneous expansion rate $\dot{R}(t)$, by contrast with the horizon
distance, which depends upon the expansion history over some interval (the particle horizon, e.g. depends on 
the full expansion history). For radiation or matter-dominated solutions, the two quantities have the same order 
of magnitude. 


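8 The stress-energy tensor is given by $T_{ab} = \nabla_a\phi\nabla_b\phi - \frac{1}{2} g_{ab}\left(g^{cd}\nabla_c\phi\nabla_d\phi + 2V(\phi)\right)$, where $V(\phi)$ is the effective
potential; for a homogeneous state, such that $V(\phi) \gg g^{cd}\nabla_c\phi\nabla_d\phi$, $T_{ab} \approx -V(\phi)g_{ab}$, leading to $R(t) \propto e^{\xi t}$
with $\xi^2 = \frac{8\pi}{3}V(\phi)$.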
pre-inflationary values of $\Omega(t_i)$ close enough to 1 by the end of inflation to be compatible
with observations. 
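
The size of these effects is easy to estimate. The following sketch takes the number of e-foldings as its only input and assumes exactly exponential expansion, with the equation-of-state parameter equal to zero throughout inflation. The function names are hypothetical.

```python
import math


def horizon_stretch(n_efolds: float) -> float:
    """Factor e**N by which N e-foldings multiply the horizon distance."""
    return math.exp(n_efolds)


def flatness_suppression(n_efolds: float) -> float:
    """Factor by which |Omega - 1| / Omega shrinks during inflation.

    With gamma = 0 the relation |Omega - 1| / Omega ~ R**(3*gamma - 2)
    reduces to R**-2, and R grows by e**N, giving e**(-2N).
    """
    return math.exp(-2.0 * n_efolds)


if __name__ == "__main__":
    n = 65  # the threshold quoted in the text
    print(f"horizon stretch:      {horizon_stretch(n):.3e}")
    print(f"flatness suppression: {flatness_suppression(n):.3e}")
```

For 65 e-foldings the horizon distance grows by a factor of order 10^28, while any initial departure of the density parameter from one is reduced by a factor of order 10^-57. This is why a wide range of pre-inflationary values ends up close to the flat model.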

The most remarkable feature of inflation, widely recognized shortly after Guth’s paper, 
was its ability to generate a nearly scale-invariant spectrum of density perturbations with 
correlations on length scales larger than the Hubble radius.⁹ Inflation produces density perturbations
by amplifying vacuum fluctuations in a scalar field $\phi$, with characteristic features
due to the scaling behavior of the field through an inflationary phase. Start with a massless,
minimally coupled scalar field $\phi$ evolving in a background FLRW model. The Fourier
modes $\phi_k$ of linear perturbations to a background solution (labeled by comoving wavenumber $k$) are uncoupled, with evolution
like that of a damped harmonic oscillator. For modes such that $k/R \gg H$, the damping
term is negligible, whereas those with $k/R \ll H$ evolve like an over-damped oscillator and
“freeze in” with a fixed amplitude. (This behavior follows from the Klein–Gordon equation in an FLRW spacetime, considering
linearized perturbations around a background solution; see Mukhanov et al. (1992) for
a comprehensive review of the evolution of perturbations through inflation, or Liddle and
Lyth (2000) for a textbook treatment.) The inflationary account runs very roughly as follows.
Prior to inflation one assumes a vacuum state, i.e.
the modes $\phi_k$ are initially in their ground state. For $k/R \gg H$ the modes evolve adiabatically,
remaining in their ground states. This account is not sensitive to exactly when a given
mode is assumed to be born in its ground state. During inflation the modes scale with the
exponential expansion whereas $H$ is approximately constant. Due to this scaling behavior,
modes will reach the horizon scale $k/R \approx H$ (“horizon exit”). The damping term is
then no longer negligible and the modes “freeze in” as they cross the horizon. Modes then
“re-enter” the horizon after inflation has ended, because in standard FLRW expansion the
scaling behavior is reversed. Finally, these modes are treated as classical density perturbations
upon re-entering the horizon. (This is a quantum to classical transition; whether it can
be justified by appeals to squeezing of the quantum state and decoherence is contentious.)
This evolution leads to a nearly scale invariant spectrum, with the amplitude of the perturbations
fixed by the energy scale of inflation at horizon exit (as discussed below). (The
spectrum is not exactly scale invariant because the Hubble constant is not truly constant 
throughout inflation.) 
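
The damped-oscillator behavior invoked in this account can be written out explicitly. The equation below is the standard mode equation for a massless, minimally coupled scalar field in a spatially flat FLRW background. The notation ($k$ for the comoving wavenumber, overdots for derivatives with respect to cosmic time) is assumed here for illustration.

```latex
% Mode equation for a massless, minimally coupled scalar field in a
% spatially flat FLRW background; k is the comoving wavenumber.
\[
  \ddot{\phi}_k + 3H\,\dot{\phi}_k + \frac{k^2}{R^2}\,\phi_k = 0 ,
  \qquad H = \frac{\dot{R}}{R} .
\]
% k/R \gg H: the friction term 3H\dot{\phi}_k is negligible and the mode oscillates.
% k/R \ll H: the friction term dominates and \phi_k tends to a constant ("freezes in").
```

During inflation $H$ is roughly constant while $k/R$ falls exponentially, so each mode passes from the oscillatory regime to the frozen regime at horizon exit.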

To provide an account of the SMC’s initial state, the inflationary phase has to be fol- 
lowed by a stage called “re-heating”. Any matter or radiation present prior to inflation is 
rapidly diluted away during inflationary expansion, leaving a universe that is essentially 
empty except for the inflaton field. Reheating is required to fill the universe with matter 
and radiation, with temperature and densities appropriate for subsequent evolution within 
the standard big bang model. 


9 This is sometimes regarded as a successful prediction of inflation, since this feature of inflation was initially 
not known to many researchers (despite early results, including Mukhanov and Chibisov, 1981; Starobinsky, 
1979, 1980). Yet the initial prediction based on specific models under discussion (with inflation driven by a 
Higgs field in a grand unified theory) was incorrect, as the amplitude of perturbations was far too large. Ensuring
the correct amplitude leads to one of the “fine-tuning” problems of inflation, since the coupling of the scalar 
field driving inflation has to be very small. 
Inflation provides a physical account of otherwise puzzling features of the starting point
for the SMC. This is often described as solving “fine-tuning” problems of the SMC. Imag- 
ine choosing a cosmological model at random from among the space of solutions of EFE. 
Even without a well-defined measure on this space of solutions, it seems obvious that an 
FLRW model (or a perturbed FLRW model) must be an incredibly “improbable” choice. 
According to the SMC alone, what we observe is incredibly improbable; according to 
the SMC plus inflation, on the other hand, what we observe is to be expected, because 
“generic” pre-inflationary states lead to an appropriate starting point for the SMC. There 
are several objections to this line of argument, some of which go back to an incisive early 
criticism by Penrose.¹⁰ Perhaps the most fundamental objection regards the starting point
for the argument: why should we treat the initial state as “generic”, a “random choice” from 
among all possible states? (Possible according to which theory?) It is also not clear that 
inflation succeeds by its own lights: Penrose, in particular, argued that a pre-inflationary 
patch with an appropriate state to trigger the onset of inflation should be less likely than an 
initial state for the SMC (without inflation). I will not explore these issues further here, in 
part because many proponents of inflation apparently regard the emphasis on fine-tuning 
as part of the initial motivation for inflation that can now be replaced with a more powerful 
empirical argument (see, e.g. Liddle and Lyth 2000, p. 5), to which I now turn. 


10.4 Assessing Inflation 


Inflation provides a promising account of the origins of the initial state for the SMC, and 
at the same time opens up the prospect of using observations of the CMB and large scale 
structure to constrain physics at an energy scale of $\sim 10^{15}$–$10^{16}$ GeV. Unlike other competing
theories, it has not been ruled out as the observational picture of the early universe
has come into sharper focus over the last 30 years. Observations have led to a remarkably 
simple picture of the state of the early universe, which is well described by a flat FLRW 
model, with Gaussian, adiabatic, linear, nearly scale invariant density perturbations. Inflation
generates primordial fluctuations in the very early universe, at length scales larger
than the Hubble radius. As they cross the Hubble radius, they set up coherent oscillations 
leading to acoustic peaks in the CMB power spectrum. Observations of acoustic peaks sup- 
port the primordial nature of the fluctuations, contrasting with predictions from competing 
models of structure formation based on active sources for fluctuations (such as topological 
defects). (See, e.g. Durrer et al. (2002) for a review of structure formation via topological 
defects, and its contrasting predictions for CMB anisotropies.) That inflation is compatible 
with this observational picture of the early universe is an important success.¹¹ Does this
amount to mere compatibility with the data, or does inflation fulfill its promise of providing 


10 See Penrose 2004, Chapter 28 for a recent exposition of the arguments he first made in the early 1980s; see also 
Albrecht (2004); Earman and Mosterin (1999); Carroll and Tam (2010); Gibbons and Turok (2008); Hollands 
and Wald (2002b) for related discussions. 

11 Here I will not address other conceptual and theoretical problems related to inflation, discussed in, e.g. 
Brandenberger (2014); Earman and Mosterin (1999); Hollands and Wald (2002b); Turok (2002). 
a physical understanding of the early state? Here I will briefly assess challenges to providing
stronger evidence in favor of inflation, based on following the strategies described
above. 

The inflaton is typically treated as a new field to be added to the Standard Model of par- 
ticle physics. Michael Turner called inflation a “paradigm without a theory” to emphasize 
the resulting flexibility of inflation. A bewildering variety of different inflationary models 
has been proposed, so many that theorists complain of the difficulty in finding an unused 
name for a new model. Many of the models can be characterized in terms of the Lagrangian 
proposed for the inflaton field: 


\[
\mathcal{L} = -\frac{1}{2}\, g^{ab}\, \partial_a \phi\, \partial_b \phi - V(\phi) + \mathcal{L}_I(\phi, A_a, \psi, \ldots), \tag{10.1}
\]


where $V(\phi)$ is the effective potential, and $\mathcal{L}_I$ is an interaction term, specifying interactions
with other fields in the Standard Model. Assuming that inflation is driven by a single field 
with a Lagrangian with this form already reflects some simplifications. Inflationary mod- 
els with multiple scalar fields have been developed, motivated by proposals in high-energy 
physics that include many light scalar fields expected to be dynamically relevant in the 
early universe. But Planck observations support restricting attention to simple single-field 
models. Planck 2015 data provides strong evidence that the perturbations are adiabatic, 
which is compatible with simple single-field models; the failure to detect non-Gaussianities 
further supports the use of single-field models, and the choice of a standard kinetic term 
(the first term) in the Lagrangian (see, e.g. the discussions in Ade et al. (2015), §10, and 
Martin (2015)). A model from this class is characterized by a choice of the effective potential
$V(\phi)$ and interaction term $\mathcal{L}_I$, along with assumptions regarding initial conditions for
the field. 

Observations of the CMB and large scale structure constrain the Lagrangian in two main 
ways. The primordial fluctuations place constraints on the effective potential well before 
the end of the inflationary phase. Inflation generates scalar and tensor perturbations whose 
physical properties depend on the features of the effective potential $V(\phi)$ at horizon exit,
with $k/a = H$ (for a mode of comoving wavenumber $k$, with $a$ the scale factor). Perturbations
relevant to CMB observations typically crossed the horizon at
≈ 60 e-foldings before the end of inflation, whereas those that are re-entering the horizon
now were produced a few e-foldings later. (The calculation is model-dependent and
depends on assumptions regarding the reheating temperature; this estimate holds for a 
variety of slow-roll models with plausible further assumptions.) The features of scalar 
and tensor perturbations amplified through the inflationary phase can be described, in 
some cases, with equations relating the perturbation spectra to the value of $V(\phi)$ and its
derivatives (at the scale when the perturbations crossed the horizon). Equations have been 
derived for models satisfying the slow-roll approximation, although it is not possible in 
general to calculate the perturbation spectra for an arbitrarily chosen $V(\phi)$. “Slow-roll”
models feature a flat effective potential, such that (roughly) $V', V'' \ll V$, leading to a
long inflationary phase, sufficient to solve the horizon and flatness problems. (Here $'$ is
the derivative $\frac{d}{d\phi}$. The slow-roll conditions are constraints on $V'$, $V''$ which ensure that the
damping term ($3H\dot{\phi}$) dominates over $\ddot{\phi}$ in the equation of motion for the inflaton field:
$\ddot{\phi} + 3H\dot{\phi} + V'(\phi) = 0$.) For these models there are simple expressions for the amplitude,
spectral index, and “running” of the spectral index for both scalar and tensor perturbations 
in terms of V, V’, V”. The scalar and tensor perturbation spectra are not independent, and 
a consistency relation, relating the spectral index of the tensor perturbations to the ratio of 
amplitudes of scalar and tensor perturbations, r, can be obtained by solving to eliminate 
V. (More generally, in a perturbative treatment there is a hierarchy of consistency rela- 
tions. There are a few different parametrizations used in relating the effective potential to 
observable features of the perturbations, including slow-roll parameters and Hubble flow 
parameters.) 
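
For orientation, the leading-order slow-roll expressions can be sketched as follows. This is one common parametrization only: the potential slow-roll parameters $\epsilon_V$ and $\eta_V$, the reduced Planck mass $M_{\rm Pl}$, and the symbols $n_s$ and $n_t$ for the scalar and tensor spectral indices are notation introduced here for illustration, and normalization and sign conventions differ across the literature.

```latex
\begin{gather}
\epsilon_V \equiv \frac{M_{\rm Pl}^2}{2}\left(\frac{V'}{V}\right)^2, \qquad
\eta_V \equiv M_{\rm Pl}^2\,\frac{V''}{V}, \\
n_s - 1 \simeq 2\eta_V - 6\epsilon_V, \qquad
n_t \simeq -2\epsilon_V, \qquad
r \simeq 16\,\epsilon_V, \\
\text{so that, eliminating } \epsilon_V: \qquad r \simeq -8\, n_t .
\end{gather}
```

The last line is the lowest-order form of the consistency relation: it links two separately measurable features of the tensor spectrum without reference to the particular shape of the potential.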

A successful account of reheating depends on a different part of $V(\phi)$, along with the
interaction term $\mathcal{L}_I$. Early accounts of inflation treated reheating as occurring when the
inflaton field oscillated near the true minima of $V(\phi)$, assumed to be much steeper than the
flat plateau needed for slow-roll, transferring energy to other particle species. Subsequent 
work has focused on energy transfer from the inflaton field to other particle species via 
coherent oscillations with parametric resonance. Observational constraints on the details 
of reheating are weaker than those related to the generation of primordial fluctuations (see, 
e.g. Martin et al., 2015). 

There are several different approaches to reconstructing the inflaton potential from 
observations, and evaluating competing inflationary models (see, e.g. Lidsey et al., 1997;
Martin et al., 2013). Given the wealth of observational data already available, upcoming 
observations, and the large variety of inflationary models, these techniques naturally focus 
on determining which models best fit the data. Martin et al. (2013), for example, adopt 
a Bayesian approach to analyze 193 single-field slow-roll inflationary models, concluding
that nine models with “plateau”-shaped potentials are preferred. The method relies
on statistical tools optimized for determining the best model given the inherent noise and 
uncertainty of observational data. The ranking weighs the closeness of fit to the data a given 
model achieves against the complexity of the model, to avoid the pitfall of overfitting the 
data, but it is not designed to assess physical plausibility of a given model. Although this 
issue deserves further scrutiny, the epistemic point addressed in the model selection liter- 
ature is distinct from the question addressed by the historical strategies discussed above. 
Finding the best fit model, granting the general framework used for interpreting the data, 
is not the same as evaluating the framework itself, although obviously the existence of a 
successful model — or lack of one — is relevant to this second task. The historical strategies 
aim to assess the validity of the underlying framework, to guard against being system- 
atically misled by accepting an incorrect framework that nevertheless accommodates the 
data. 


Turning to the first strategy discussed above, observations do provide independent con- 
straints on the underlying inflationary mechanism for amplifying perturbations. A scale- 
invariant spectrum of scalar perturbations was proposed well before inflation on general 
grounds (Harrison, 1970; Peebles and Yu, 1970; Zel’dovich, 1972). But there is not a 
similar argument in favor of a scale-invariant tensor perturbation spectrum, or any theory-independent
reason to expect the two spectra to be linked as reflected in the consistency
relation. Furthermore, measurements of the tensor perturbations directly constrain $V(\phi)$
at the point where a given length scale crossed the horizon. Measuring the tensor perturbation
spectrum at different length scales, if it were feasible observationally, would give
a direct reconstruction of $V(\phi)$.¹² Detection of CMB B-mode polarization, leading to a
measurement of $r$, along with a measurement of the spectral index for tensor perturbations,
$n_t$, directly tests the consistency relation. Measuring $r$ is the target of a number of post-Planck
missions, but the follow-up measurement of $n_t$ is particularly challenging for small
values of $r$. The possibility of nailing down the inflationary mechanism for amplifying
perturbations in this fashion is certainly one of inflation’s most appealing features.
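
The difficulty of the follow-up measurement can be made concrete with a small numerical sketch. It assumes the leading-order single-field slow-roll form of the consistency relation, n_t = -r/8, which is a standard result but is not stated explicitly in the text above; the function names and the sample values of r are hypothetical and chosen only for illustration, not taken from any data set.

```python
def tensor_tilt_from_r(r: float) -> float:
    """Predicted tensor spectral index from the tensor-to-scalar ratio.

    ASSUMPTION: leading-order single-field slow-roll relation n_t = -r / 8.
    """
    return -r / 8.0


def consistency_violation(r: float, n_t: float, sigma_n_t: float) -> float:
    """Deviation of a measured n_t from the prediction, in units of sigma_n_t."""
    return (n_t - tensor_tilt_from_r(r)) / sigma_n_t


# Illustrative (not measured) values of r: the predicted tilt shrinks with r,
# so separating it from an exactly scale-invariant spectrum (n_t = 0)
# requires an uncertainty on n_t well below r / 8.
for r in (0.1, 0.01, 0.001):
    print(f"r = {r:<6} -> predicted n_t = {tensor_tilt_from_r(r):+.6f}")
```

On this sketch, a test of the relation is only informative when the uncertainty on the measured tilt is small compared with r/8, which is why small values of r make the test so demanding.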

There are, however, several contrasts with overdetermination arguments such as Perrin’s. 
The first contrast regards the target of the argument, the Lagrangian for the inflaton field,
and in particular the function $V(\phi)$ and the various couplings included in the interaction
term $\mathcal{L}_I$. It is obviously much more challenging to provide a compelling overdetermination
argument for the Lagrangian as opposed to a single number N. Furthermore, the existing 
observational constraints apply to two distinct dynamical regimes of the inflaton’s evolu- 
tion: the amplification of quantum fluctuations at horizon crossing, ~ 60 e-folds before the 
end of inflation, compared with the decay of the inflaton and reheating at the very end of the 
inflationary phase. Inflationary models are a package deal rather than a single ticket: with- 
out theoretical constraints on the properties of the inflaton field, one can choose the shape 
of the potential relevant to amplification of perturbations, and then separately choose the 
shape of the potential near the true minimum and couplings in the interaction term. As long 
as this remains a relatively free choice, with weak constraints imposed in either direction, 
success in these two distinct dynamical regimes does not provide overlapping constraints. 

The evidential situation changes substantially for an inflaton Lagrangian that is identi- 
fied within a specific particle physics model. In such a case, the parameters appearing in 
the Lagrangian are constrained by cosmological data related to the details of inflation, 
as well as whatever experimental data are relevant to the particle physics model. This 
would provide a compelling set of independent constraints. Furthermore, since the infla- 
ton model would be a single ticket item in this case, different cosmological measurements 
provide overlapping constraints on the Lagrangian. Yet the promise of directly identify- 
ing a canonical candidate for the inflaton field has not been fulfilled; instead, there has 
been a proliferation of toy models of inflation. Constructing physically plausible models
for the inflaton has been difficult because $V(\phi)$ has to be very flat. The prospects for
re-establishing a tighter link through direct experimental study are extremely bleak: the 


12 This may even include constraints based on solar-system scale gravitational wave observatories. Boyle et al. 
(2014) argue that if there is a tilt in the tensor perturbation spectrum, as suggested by the BICEP2 initial 
results, the proposed Big Bang Observatory would provide a second set of measurements at a scale $10^{18}$
smaller than those relevant for the CMB, providing an enormous lever arm for more precise tests of inflation. 
See also Alvarez et al. (2014) for an overview of tests of inflation based on large-scale structure. 
properties required for an inflaton field in a slow-roll model ensure that it can only feasibly
be studied observationally through its impact on the early universe. 

Even without resolving the identity of the inflaton, the case for inflation can be strength- 
ened by imposing other constraints on the class of allowed models. In practice this is 
reflected in assessments of the plausibility of different inflationary models, given assump- 
tions about physics at the appropriate energy scale. There have also been proposals to 
characterize how inflationary predictions depend upon the amount of fine-tuning of the 
potential, which lead to constraints on the parameter ranges compatible with less finely-tuned
potentials. (Boyle et al. (2006), for example, characterize fine-tuning in terms of the
number of zeroes appearing in the slow-roll parameter $\eta$ and its first derivative (with respect
to the number of e-folds), which is intended as a measure of the number of “features” added
to the effective potential; inflaton models with little fine-tuning in this sense favor specific
parameter ranges for $n_s$, $r$.) On either of these approaches, considerable weight is put on the
further constraints imposed in the name of plausibility or simplicity. Past debates regarding 
the viability of different types of models make the challenges to achieving consensus on 
these questions clear.¹³

A final contrast regards the assessment of alternatives. Perrin argued that the agreeing 
measurements of $N$ would be an enormous coincidence if the atomic hypothesis were false.
How likely is the simple early state required by the SMC, if inflation did not occur? Turok 
(2002), for example, remarks that “The success of the simplest inflationary models is per- 
haps more of a success for simplicity than it is for inflation” (p. 3458, emphasis original). 
Any early universe theory that generates a nearly scale invariant spectrum of primordial 
fluctuations will match many of inflation’s successes. Theorists have discovered several 
different ways of generating such fluctuations, ranging from alternative ways of modifying 
causal structure (varying speed of light theories) to “bounce” models, which replace the 
Big Bang singularity with a Big Bounce. They treat the primordial fluctuations as gen- 
erated prior to the bounce, although details of implementation, and the physics used to 
construct the model, differ substantially among the different models. (See Brandenberger 
(2014) for an overview of the matter bounce and string gas models, and Lehners (2008) 
for a review of ekpyrotic and cyclic models.) There is no reason to expect a consistency 
relation between tensor and scalar perturbations in a “simple” initial state, and this rela- 
tion also discriminates between inflation and other models for the generation of primordial 
fluctuations. Further observational work, in particular detection of primordial gravitational 
waves and tests of the consistency relation, would lead to a much stronger case that the 
observed properties of the early universe would be an enormous coincidence if inflation 
were false. 


13 For example, inflation is commonly taken to predict a flat universe with $\Omega_0 = 1$. There were heated debates in
the mid to late 1990s regarding so-called “open inflation” models that yield a value of $\Omega_0 \approx 0.2$ to $0.3$, which
was at that time favored by observations. Insofar as these were regarded as plausible models, inflation no
longer predicts flatness, and the value of $\Omega_0$ instead provides a constraint on the parameter space for models
(see, e.g. the discussions in Turok, 1997).
Regarding the second strategy, I am unaware of any case in which inflation has been
used to uncover a new feature of the early universe that can be independently checked. 
There is a variety of ways in which inflationary model-building has become more sophis- 
ticated, with a much clearer understanding of what needs to be in place in a full account 
of inflation. There is no shortage of theoretical innovation in building inflationary models. 
Inflation has also guided observational work by identifying features that can be used to 
constrain inflationary models and contrast inflation with competing theories. But the ques- 
tion is whether inflation has allowed cosmologists to identify robust physical features of 
the early universe that can be tested in ways that do not assume inflationary theory itself. 
There are no analogs, as far as I am aware, of adding a new physical feature as part of the 
model that can, like the existence of Neptune, be easily checked by other means. This is in 
part due to the observational inaccessibility of the early universe, but also to the lack of a 
canonical choice of the inflaton field. Given a fixed choice for the inflaton field, discrep- 
ancies with observations would force theorists to elaborate the model, possibly identifying 
new features of the early universe in the process. At present the choice of inflationary 
models is too flexible to support this kind of approach. 


Above I emphasized the need for multiple, independent lines of evidence, in order to mit- 
igate the theory-dependence of evidential reasoning. The challenges to pursuing the two 
historical strategies in cosmology both reflect our lack of accessibility to the early universe 
and to the energy scales of inflation. The observed state of the universe is compatible with 
inflationary models, but we have not yet developed a more detailed account of how infla- 
tion occurred. In the historical cases described above, it was ultimately the development of 
detailed accounts of the nature of atoms, and of the motions in the solar system and their 
causes, that provided confidence in the theories employed along the way. The alternative 
to regarding these theories as stable contributions to our knowledge of the natural world is 
to accept that, for example, measurements of the Earth’s slowing rotation by atomic clocks 
simply happened to agree with measurements of the Moon’s motion, by an astronomical 
coincidence. A successful, detailed account of inflation, going beyond the initial step of 
using CMB data to constrain the inflaton Lagrangian, could support an argument of this 
sort. The challenge to taking the next steps is our lack of access to energy scales associated 
with the inflaton, and to specific quantities that discriminate among models. 

The challenges to the observational program of further constraining the inflaton field 
have little to do directly with the distinctive features of cosmology, such as the uniqueness 
of the universe. Neither of the strategies described above requires that the system under
study can be experimentally manipulated. It is also not essential to consider a repeatable 
phenomenon, with multiple instances subject to study. The inability to conduct relevant 
experiments, and lack of multiple instances, are often taken to distinguish cosmology from 
other areas of inquiry, leading to limits on what can be established (e.g. Munitz, 1962). 
To make the contrast between limitations that are inherent to cosmology and problems of 
accessibility more vivid, imagine that an alien civilization provided us with an accelerator 
able to probe physics at $10^{16}$ GeV. Access to the physics at this energy scale, to determine
the properties of the inflaton (if it exists), would enable thorough development and testing
of a detailed account of the universe’s early history. 

There is a more interesting challenge in cosmology regarding how to deal with ini- 
tial conditions, and potential trade-offs between assumptions regarding initial conditions 
and dynamics (cf. Smolin, 2015). Early discussions of inflation often emphasized its abil- 
ity to “wash away” dependence on the pre-inflationary state of the universe, doing away 
with the need for assumptions about the initial state. (Collins and Stewart (1971) noted 
in response to a precursor to inflation that dynamics cannot completely “wash away” the 
initial state, however. Given fairly weak assumptions about the dynamics, it follows from 
standard existence and uniqueness theorems for differential equations that one can always 
find a pre-inflationary state that will lead to any given post-inflationary value of $\Omega_0$ (for
example).) But it is clear that inflation requires assumptions regarding the initial state of 
the inflaton field (homogeneous, with an appropriate value of $V(\phi)$, in a spacetime region
larger than the Hubble radius), along with an appropriate form of the potential $V(\phi)$. These
are sometimes called inflation’s fine-tuning problems. Assumptions about what is a plausi- 
ble initial state are also relevant to assessing the account of structure formation. Hollands 
and Wald (2002b) construct a simple model that produces a similar spectrum of density 
perturbations without an inflationary phase based on a different Ansatz for the initial con- 
ditions. Their model describes quantized sound waves in a perfect fluid, with the same 
“overdamping” of modes with $\lambda \gg H^{-1}$ as in inflation. By contrast with inflation, there
is no horizon crossing, so it is significant precisely when the modes are taken to be in a vacuum
state.¹⁴ The fine-tuning problems of inflation are often thought to be resolved within
the context of eternal inflation, to which I now turn. 


Many cosmologists hold that inflation is “generically eternal”, in the sense that inflation
produces a multiverse consisting of “pocket universes”, where inflation has ended, even as 
inflationary expansion continues elsewhere. (See, e.g., Aguirre (2007) for an introduction 
to eternal inflation.) The mechanism leading to this multiverse structure is also assumed 
to lead to some variation in the physical parameters among the different pocket universes. 
The solution to the fine-tuning problems of the original inflationary models is based on 
invoking an anthropic selection effect: pocket universes featuring observers will be ones in 
which various necessary preconditions for the existence of life like us hold. These precon- 
ditions plausibly include the existence of structures like galaxies; the formation of galaxies 
depends upon the presence of small fluctuations in an expanding FLRW model; and the 
small fluctuations themselves are ultimately traced back to an initial state for $\phi$ and form
of the effective potential $V(\phi)$ appropriate to trigger an inflationary phase.


14 Hollands and Wald (2002a) propose to take the modes to be “born” in a ground state when their proper 
wavelength is equal to the Planck scale, motivated by considerations of the domain of applicability of semi- 
classical quantum gravity. The modes will be “born” at different times, continually “emerging out of the 
spacetime foam”, with the modes relevant to large-scale structure born at times much earlier than the Planck 
time. By way of contrast, in the usual approach the modes at all length scales are specified to be in a ground 
state at a particular time, such as the Planck time. But the precise time at which one stipulates the field modes 
to be in a vacuum state does not matter given that the sub-horizon modes evolve adiabatically. 
Accepting eternal inflation undermines the observational program of attempting to constrain
and fix the features of the inflaton field, in two senses.¹⁵ First, the appeal to anthropic
selection undercuts the motivation for introducing a specific dynamical mechanism for 
generating a multiverse. The anthropic argument is intended to counter the objection that 
$\phi$ and $V(\phi)$ probably do not have appropriate properties to initiate inflation. While that
may be true in the multiverse as a whole, it is, the argument goes, not the case in the 
habitable pocket universes, which are expected to have undergone inflation in order to 
produce galaxies (for example). However, the argument works just as well with other pro- 
posed ensembles, as long as the observed universe is compatible with the underlying laws. 
Rather than the inflationary multiverse, why not simply consider a relativistic cosmolog- 
ical model with infinite spatial sections, and some variation among different regions? By 
parity of reasoning, even if a region with properties like our observed Hubble volume is 
incredibly improbable in general, it may be highly probable within the anthropic subset. I 
do not see a plausible way to refine the argument to draw a distinction between these two 
cases, so that one can preserve the original motivations for inflation while accepting eternal 
inflation. 

The second challenge raised by eternal inflation regards the prospects for using evidence 
to constrain theory along the lines outlined above. Briefly put, the two strategies above 
both rely on the exactness of theory in order to develop strong evidence — either in the 
form of connections among different types of phenomena, or in the form of rigidity as a 
theory is extended to give a more detailed account. Eternal inflation is anything but exact. 
Deriving “predictions” from eternal inflation requires specifying the ensemble of pocket 
universes under consideration; a measure over this ensemble that is well motivated; and a 
specification of the subset of the ensemble within which observers can be located. Each of 
these raises a number of technical and conceptual problems. But even if these are resolved, 
there are then several substantive auxiliary assumptions standing between the predictions 
of eternal inflation and the comparison with observations. Rather than using observations 
to directly constrain and probe the physics behind the formation of structure, we would 
instead be delimiting the anthropic subset of the multiverse. 


10.5 Conclusion 


Science often proceeds by making substantial theoretical assumptions that allow us to 
extend our reach into a new domain. My approach above has been to focus on asking 
how evidence can accumulate in favor of these assumptions as the research based on them 
advances. In many historical cases, subsequent research has established that a theory has to 
be accepted, at least as a good approximation, with the only alternative being to accept an 
enormously implausible set of coincidences. Based on the methodological insights gleaned 
from these historical cases, I have argued that the main problems with establishing inflation 


15 See Ijjas et al. (2013b, 2014) for an assessment of related problems, along with the response by Guth et al. 
(2014). I discuss these issues at greater length in Smeenk (2014). 
with the same degree of confidence stem from our lack of independent lines of access. In
addition, eternal inflation undercuts the observational program devoted to constraining the 
inflaton field. 
Why Boltzmann Brains do not Fluctuate into Existence
from the de Sitter Vacuum 


KIMBERLY K. BODDY, SEAN M. CARROLL AND JASON POLLACK 


11.1 Introduction 


The Boltzmann Brain problem [2, 6, 7] is a novel constraint on cosmological models. It 
arises when there are thought to be very large spacetime volumes in a de Sitter vacuum
state, that is, empty space with a positive cosmological constant $\Lambda$. This could apply to theories
of eternal inflation and the cosmological multiverse, but it also arises in the future of our
current universe, according to the popular ΛCDM cosmology.

Observers in de Sitter are surrounded by a cosmological horizon at a distance $R = H^{-1}$,
where $H = \sqrt{\Lambda/3}$ is the (fixed) Hubble parameter. Such horizons are associated with a
finite entropy $S = 3\pi/G\Lambda$ and temperature $T = H/2\pi$ [11]. With a finite temperature and
spatial volume, and an infinite amount of time, it has been suggested that we should expect 
quantum/thermal fluctuations into all allowed configurations. In this context, any particular 
kind of anthropically interesting situation (such as an individual conscious “brain”, or the 
current macrostate of the room you are now in, or the Earth and its biosphere) will fluctu- 
ate into existence many times. With very high probability, when we conditionalize on the 
appearance of some local situation, the rest of the state of the universe will be generic (that is,
close to thermal equilibrium), and both the past and future will be higher-entropy states.¹
These features are wildly different from the universe we think we actually live in, featuring 
a low-entropy Big Bang state approximately 13.8 billion years ago. Therefore, the story 
goes, our universe must not be one with sufficiently large de Sitter regions to allow such 
fluctuations to dominate. 
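
To give a feel for the magnitudes involved, the horizon quantities quoted above can be evaluated numerically. The following is a minimal sketch: it restores the factors of c, ħ and k_B that are set to one in the text, uses rounded CODATA constants, and assumes an illustrative value of the cosmological constant (about 1.1 × 10⁻⁵² m⁻², roughly the ΛCDM value). The function names and that value of Λ are assumptions made here, not part of the original argument.

```python
import math

# SI constants (CODATA values, rounded).
C = 2.99792458e8        # speed of light, m/s
G = 6.67430e-11         # Newton's constant, m^3 kg^-1 s^-2
HBAR = 1.054571817e-34  # reduced Planck constant, J s
K_B = 1.380649e-23      # Boltzmann constant, J/K

# ASSUMPTION: illustrative value of the cosmological constant, in m^-2.
LAMBDA = 1.1e-52


def hubble_rate(lam: float) -> float:
    """H = c * sqrt(Lambda / 3) in s^-1 (H = sqrt(Lambda / 3) when c = 1)."""
    return C * math.sqrt(lam / 3.0)


def horizon_radius(lam: float) -> float:
    """R = c / H = sqrt(3 / Lambda), in metres."""
    return math.sqrt(3.0 / lam)


def horizon_entropy(lam: float) -> float:
    """S / k_B = 3 pi c^3 / (hbar G Lambda), i.e. 3 pi / (G Lambda) in natural units."""
    return 3.0 * math.pi * C**3 / (HBAR * G * lam)


def horizon_temperature(lam: float) -> float:
    """T = hbar H / (2 pi k_B) in kelvin, i.e. H / (2 pi) in natural units."""
    return HBAR * hubble_rate(lam) / (2.0 * math.pi * K_B)


print(f"R     ~ {horizon_radius(LAMBDA):.2e} m")
print(f"S/k_B ~ {horizon_entropy(LAMBDA):.2e}")
print(f"T     ~ {horizon_temperature(LAMBDA):.2e} K")
```

With the assumed Λ, the entropy comes out of order 10¹²² and the temperature of order 10⁻³⁰ K, which indicates how extraordinarily rare any macroscopic thermal fluctuation would have to be, and hence why the argument leans on the availability of an infinite amount of time.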

In this chapter we summarize and amplify a previous paper in which we argued that 
the Boltzmann Brain problem is less generic (and therefore more easily avoided) than 
is often supposed [5]. Our argument involves a more precise understanding of the infor- 
mal notion of “quantum fluctuations”. This term is used in ambiguous ways when we 
are talking about conventional laboratory physics: it might refer to Boltzmann (thermal) 


1 The real problem with an eternally fluctuating universe is not that it would look very different from ours. It is
that it would contain observers who see exactly what we see, but have no reason to take any of their observations 
as reliable indicators of external reality, since the mental impressions of those observations are likely to have 
randomly fluctuated into their brains. 
fluctuations, where the microstate of the system is truly time-dependent; or measurement-induced
fluctuations, where repeated observation of a quantum system returns stochastic
results; or time-independent “fluctuations” of particles or fields in the vacuum, which are 
really just a poetic way of distinguishing between quantum and classical behavior. In the 
de Sitter vacuum, which is a stationary state, there are time-independent vacuum fluctua- 
tions, but there are no dynamical processes that could actually bring Boltzmann Brains (or 
related configurations) into existence. Working in the Everett (Many-Worlds) formulation 
of quantum mechanics, we argue that the kinds of events where something may be said to 
“fluctuate into existence” are dynamical processes in which branches of the wave function 
decohere. Having a non-zero amplitude for a certain quantum event should not be directly 
associated with the probability that such an event will happen; things only happen when 
the wave function branches into worlds in which those things occur. 

Given this understanding, the Boltzmann Brain problem is avoided when horizon-sized 
patches of the universe evolve toward the de Sitter vacuum state. This is generically to 
be expected in the context of quantum field theory in curved spacetime, according to the 
cosmic no-hair theorems [16, 19, 28]. It would not be expected in the context of horizon 
complementarity in a theory with a true de Sitter minimum; there, the whole theory is 
described by a finite-dimensional Hilbert space, and we should expect Poincaré recurrences 
and Boltzmann fluctuations [2-4, 6, 7, 21, 26, 27]. Such theories do have a Boltzmann 
Brain problem. However, if we consider a $\Lambda > 0$ false-vacuum state in a theory where
there is also a $\Lambda = 0$ state, the theory as a whole has an infinite-dimensional Hilbert space.
Then we would expect the false-vacuum state, considered by itself, to dissipate toward a 
quiescent state, free of dynamical fluctuations. Hence, the Boltzmann Brain problem is 
easier to avoid than conventionally imagined. 

Our argument raises an interesting issue concerning what “really happens” in the 
Everettian wave function. We briefly discuss this issue in Section 11.4. 


11.2 Quantum Fluctuations in Everettian Quantum Mechanics 


The existence of Boltzmann Brain fluctuations is a rare example of a question whose 
answer depends sensitively on one’s preferred formulation of quantum theory. Here we 
investigate the issue in the context of Everettian quantum mechanics (EQM) [8, 24, 29]. 
The underlying ontology of EQM is extremely simple, coming down to two postulates: 


1. The world is fully represented by quantum states $|\psi\rangle$ that are elements of a Hilbert
space $\mathcal{H}$.
2. States evolve with time according to the Schrödinger equation,

$$\hat{H}|\psi(t)\rangle = i\partial_t|\psi(t)\rangle, \tag{11.1}$$

for some self-adjoint Hamiltonian operator $\hat{H}$.


The challenge, of course, is matching this austere framework onto empirical reality. In 
EQM, our task is to derive, rather than posit, features such as the apparent collapse of
the wave function (even though the true dynamics are completely unitary) and the Born
Rule for probabilities (even though the full theory is completely deterministic). We will 
not delve into these issues here, but only emphasize that in EQM the quantum state and its 
unitary evolution are assumed to give a complete description of reality. No other physical 
variables or measurement postulates are required. 

Within this framework, consider a toy system such as a one-dimensional simple harmonic
oscillator with potential $V(x) = \frac{1}{2}\omega^2 x^2$. Its ground state is a Gaussian wave function
whose only time-dependence is an overall phase factor,


$$\psi(x, t) \propto e^{-iE_0 t}\, e^{-E_0 x^2}. \tag{11.2}$$


The phase is of course physically irrelevant; one way of seeing that is to note that equivalent 
information is encoded in the pure-state density operator, 


$$\rho(x, t) = |\psi(x, t)\rangle\langle\psi(x, t)| = |\psi(x, 0)\rangle\langle\psi(x, 0)|, \tag{11.3}$$


which is manifestly independent of time. We will refer to such states, which of course 
would include any energy eigenstate of any system with a time-independent Hamiltonian, 
as “stationary”. 
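
The distinction between a stationary state and a genuinely time-dependent one can be checked numerically. The following is a minimal sketch, assuming a four-level truncation of the oscillator, units with hbar = 1, and hypothetical helper names (`oscillator_energies`, `evolve`, `density_operator`) that are not part of the original text. It compares the density operator at two times for an energy eigenstate and for a superposition of two eigenstates.

```python
import numpy as np

def oscillator_energies(n_levels, omega=1.0):
    """Energies E_n = omega * (n + 1/2) of a truncated oscillator (hbar = 1)."""
    return omega * (np.arange(n_levels) + 0.5)

def evolve(state, energies, t):
    """Apply exp(-i H t) in the energy eigenbasis, where H is diagonal."""
    return np.exp(-1j * energies * t) * state

def density_operator(state):
    """Pure-state density operator |psi><psi|."""
    return np.outer(state, state.conj())

energies = oscillator_energies(4)
ground = np.array([1, 0, 0, 0], dtype=complex)                      # energy eigenstate
superposition = np.array([1, 1, 0, 0], dtype=complex) / np.sqrt(2)  # not an eigenstate

t = 0.7  # arbitrary illustrative time
ground_is_stationary = np.allclose(
    density_operator(evolve(ground, energies, t)), density_operator(ground))
superposition_is_stationary = np.allclose(
    density_operator(evolve(superposition, energies, t)), density_operator(superposition))

print(ground_is_stationary, superposition_is_stationary)
```

The overall phase of the eigenstate cancels in the outer product, while the relative phase in the superposition survives in the off-diagonal elements, which is why only the first state counts as stationary.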

In a stationary state, there is nothing about the isolated quantum system that is true 
at one time but not true at another time. There is no sense, therefore, in which anything 
is dynamically fluctuating into existence. Nevertheless, we often informally talk about 
“quantum fluctuations” in such contexts, whether we are considering a simple harmonic 
oscillator, an electron in its lowest-energy atomic orbital, or vacuum fluctuations in quan- 
tum field theory. Clearly it is important to separate this casual notion of fluctuations from 
true time-dependent processes. 

To that end, it is useful to distinguish between different concepts that are related to the 
informal notion of “quantum fluctuations”. We can identify three such ideas: 


• Vacuum fluctuations are the differences in properties between a quantum system and its
classical analogue, and exist even in stationary states.

• Boltzmann fluctuations are dynamical processes that arise when the microstate of a
system is time-dependent even if its coarse-grained macrostate may not be.

• Measurement-induced fluctuations are the stochastic observational outcomes resulting
from the interaction of a system with a measurement device, followed by decoherence
and branching.


Let us amplify these definitions a bit. By “vacuum fluctuations” we mean the simple fact 
that quantum states, even while stationary, generally have non-zero variance for observable 
properties. Given some observable $\hat{O}$, we expect expectation values in a state $|\psi\rangle$ to satisfy
$\langle \hat{O}^2 \rangle \neq \langle \hat{O} \rangle^2$. The fact that the position of the harmonic oscillator is not localized to the
origin in its ground state is a consequence of this kind of fluctuation. Other manifestations 
include the Casimir effect, the Lamb shift, and radiative corrections due to virtual particles
in quantum field theory. Nothing in our analysis denies the existence of these kinds of 
fluctuation; we are merely pointing out that they are non-dynamical, and therefore not 
associated with anything literally fluctuating into existence. 
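
As a concrete instance of a non-dynamical vacuum fluctuation, the variance of the oscillator position in its ground state can be computed directly. This is only a sketch, assuming units with hbar = m = 1, a six-level truncation of the number basis, and hypothetical helper names. For the ground state the truncation does not affect the result, which should equal 1/(2 omega).

```python
import numpy as np

def position_operator(n_levels, omega=1.0):
    """x = (a + a^dagger) / sqrt(2 omega) in a truncated number basis (hbar = m = 1)."""
    a = np.diag(np.sqrt(np.arange(1, n_levels)), k=1)  # annihilation operator
    return (a + a.T) / np.sqrt(2.0 * omega)

def variance(op, state):
    """<O^2> - <O>^2 for a normalized state."""
    mean = np.vdot(state, op @ state).real
    mean_sq = np.vdot(state, op @ op @ state).real
    return mean_sq - mean ** 2

omega = 1.0
x = position_operator(6, omega)
ground = np.zeros(6)
ground[0] = 1.0

var_x = variance(x, ground)
print(var_x)  # non-zero even though the state is stationary
```

The variance is a fixed property of the state. Nothing in this calculation involves time evolution, which is the sense in which such fluctuations are non-dynamical.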

This is in contrast with “Boltzmann fluctuations”, which are true dynamical processes. 
In classical statistical mechanics, we might have a system in equilibrium described by a 
canonical ensemble, where macroscopic quantities such as temperature and density are 
time-independent. Nevertheless, any particular realization of such a system is represented 
by a microstate with true time-dependence; the molecules in a box of gas are moving 
around, even in equilibrium. From a Boltzmannian perspective, we coarse-grain phase 
space into macrostates, and associate to each microstate an entropy $S = k_B \log \Omega$, where
$\Omega$ is the volume of the macrostate in which the microstate lives. We then expect rare fluctuations
into lower-entropy states, with a probability that scales as $P(\Delta S) \sim e^{-\Delta S}$, where $\Delta S$
is the decrease in entropy. Such Boltzmann fluctuations are not expected to occur in stationary
quantum states, where there is no microscopic property that is actually fluctuating
beneath the surface (at least in EQM). 
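
The scaling of classical Boltzmann fluctuations with the entropy decrease can be illustrated with a toy case that is not taken from the text: an ideal gas of N particles in a box, with the low-entropy macrostate "all particles in the left half". Halving the accessible volume for each particle reduces the phase-space volume by a factor of 2^N, so the entropy decrease is N log 2 in units of k_B. The function names below are hypothetical.

```python
import math

def entropy_decrease_all_left(n_particles):
    """Entropy decrease (in units of k_B) when all particles sit in the left half."""
    return n_particles * math.log(2.0)

def fluctuation_probability(delta_s):
    """Probability of a fluctuation with entropy decrease delta_s, P ~ exp(-delta_s)."""
    return math.exp(-delta_s)

for n in (10, 100, 1000):
    delta_s = entropy_decrease_all_left(n)
    print(n, delta_s, fluctuation_probability(delta_s))
```

In this toy case the estimate reproduces the exact combinatorial probability 2^(-N), and it shows how quickly such fluctuations become rare as the system grows.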

This can even be true in “thermal” states in quantum mechanics. Consider a composite 
system AB, with weak coupling between the two factors, and A much smaller than B. When 
the composite system is in a stationary pure state $|\psi\rangle$, we expect the reduced density matrix
of the subsystem to look thermal,

$$\rho_A = \mathrm{Tr}_B\, |\psi\rangle\langle\psi| \sim \exp(-\beta \hat{H}_A) = \sum_n e^{-\beta E_n} |E_n\rangle\langle E_n|, \tag{11.4}$$

where $\hat{H}_A$ is the Hamiltonian for A, $\beta$ is the inverse temperature, and the states $|E_n\rangle$ are
energy eigenstates of $\hat{H}_A$. Despite the thermal nature of this density operator, it is strictly
time-independent, and there are no dynamical fluctuations. 
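
The time-independence of the reduced density matrix can be checked in a small numerical sketch. The example below makes several assumptions that are not in the text: a random Hermitian matrix stands in for the Hamiltonian of AB (so the coupling is not weak), subsystem A is a single qubit and B has eight levels. It therefore does not attempt to reproduce the thermal form of (11.4). It only checks that the reduced state of A is mixed and does not change in time when the full state is an energy eigenstate.

```python
import numpy as np

def partial_trace_B(rho, dim_a, dim_b):
    """Trace out factor B from a density matrix on H_A (x) H_B."""
    rho = rho.reshape(dim_a, dim_b, dim_a, dim_b)
    return np.einsum("ibjb->ij", rho)

rng = np.random.default_rng(0)
dim_a, dim_b = 2, 8
dim = dim_a * dim_b
M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
H = (M + M.conj().T) / 2                    # random Hermitian Hamiltonian (assumption)
energies, vecs = np.linalg.eigh(H)

def rho_A_at(t, psi):
    """Reduced density matrix of A after evolving psi for time t."""
    coeffs = vecs.conj().T @ psi            # expand in energy eigenstates
    psi_t = vecs @ (np.exp(-1j * energies * t) * coeffs)
    return partial_trace_B(np.outer(psi_t, psi_t.conj()), dim_a, dim_b)

psi_stationary = vecs[:, dim // 2]                      # a single energy eigenstate
psi_generic = (vecs[:, 0] + vecs[:, 1]) / np.sqrt(2)    # superposition, for contrast

drift_stationary = np.abs(rho_A_at(5.0, psi_stationary) - rho_A_at(0.0, psi_stationary)).max()
drift_generic = np.abs(rho_A_at(5.0, psi_generic) - rho_A_at(0.0, psi_generic)).max()
purity_A = np.trace(rho_A_at(0.0, psi_stationary) @ rho_A_at(0.0, psi_stationary)).real

print(drift_stationary, drift_generic, purity_A)
```

A purity below one indicates that A is entangled with B, so its state is mixed, yet the drift for the eigenstate stays at the level of floating-point error.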

Finally, we have measurement-induced fluctuations: processes in which we repeatedly 
measure a quantum system and obtain “fluctuating” results. In EQM, the measurement 
process consists of unitary dynamics creating entanglement between the observed system 
and a macroscopic apparatus, followed by decoherence and branching. We can decompose 
Hilbert space into factors representing the system, the apparatus (a macroscopic configu- 
ration that may or may not include observing agents), and the environment (a large set of 
degrees of freedom that we do not keep track of): 


$$\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_A \otimes \mathcal{H}_E. \tag{11.5}$$


We assume that the apparatus begins in a specific “ready” state, and both the apparatus
and environment are initially unentangled with the system to be observed.² For simplicity,
imagine that the system is a single qubit in a superposition of up and down. Unitary
evolution then proceeds as follows:

$$|\Psi\rangle = (|+\rangle_S + |-\rangle_S)\,|a_0\rangle_A |e_0\rangle_E \tag{11.6}$$
$$\rightarrow (|+\rangle_S |a_+\rangle_A + |-\rangle_S |a_-\rangle_A)\,|e_0\rangle_E \tag{11.7}$$
$$\rightarrow |+\rangle_S |a_+\rangle_A |e_+\rangle_E + |-\rangle_S |a_-\rangle_A |e_-\rangle_E. \tag{11.8}$$

2 The justification for these assumptions can ultimately be traced to the low-entropy state of the early universe.


The first line represents the system in some superposition of $|+\rangle$ and $|-\rangle$, while the apparatus
and environment are unentangled with it. In the second line (pre-measurement), the
apparatus has interacted with the system; its readout value “+” is entangled with the $|+\rangle$ state
of the qubit, and likewise for “−”. In the final line, the apparatus has become entangled
with the environment. This is the decoherence step; generically, the environment states will
quickly become very close to orthogonal, $\langle e_+|e_-\rangle \approx 0$, after which the two branches of the
wave function will evolve essentially independently. If we imagine setting up a system in 
some stationary state, performing a measurement, re-setting it, and repeating the process, 
the resulting record of measurement readouts will form a stochastic series of quantities 
obeying the statistics of the Born Rule. This is the kind of “fluctuation” that would arise 
from the measurement process. 

There are several things to note about this description of the measurement process 
in EQM. First, the reduced density matrix $\rho_{SA} = \mathrm{Tr}_E\, |\Psi\rangle\langle\Psi|$ obtained by tracing over
the environment is diagonal in a very specific basis, the “pointer basis” for the apparatus
[23, 30-33]. The pointer states making up this basis are those that are robust with
respect to continual monitoring by the environment; in realistic situations, this amounts to
states that have a definite spatial profile (such as the pointer on a measuring device indicating
some specific result). Second, branching is necessarily an out-of-equilibrium process.
The initial state is highly non-generic; one way of seeing this is that the reduced density
matrix has an initially low entropy $S_{SA} = -\mathrm{Tr}\, \rho_{SA} \log \rho_{SA}$. Third, this entropy increases during
the measurement process, in accordance with the thermodynamic arrow of time. Given
sufficient time to evolve, the system will approach equilibrium and the entropy will be 
maximal. At this point the density matrix will no longer be diagonal in the pointer basis (it 
will be thermal, and hence diagonal in the energy eigenbasis). This process is portrayed in 
Figure 11.1. Needless to say, none of these features — a special, out-of-equilibrium initial 
state, in which entropy increases as the system becomes increasingly entangled with the 
environment over time — apply to isolated stationary systems. 
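
The three stages (11.6) to (11.8) and the accompanying entropy increase can be reproduced in a toy model. The sketch below assumes that the system, apparatus and environment are each a single qubit and that both interactions are CNOT gates. These modelling choices and all function names are illustrative assumptions, not the construction used in the text. It tracks the coherence between the two pointer states in the reduced density matrix of system plus apparatus, together with its von Neumann entropy.

```python
import numpy as np

def cnot(control, target, n_qubits=3):
    """Permutation matrix for a CNOT; qubit 0 is the leftmost tensor factor."""
    dim = 2 ** n_qubits
    U = np.zeros((dim, dim))
    for basis in range(dim):
        bits = [(basis >> (n_qubits - 1 - q)) & 1 for q in range(n_qubits)]
        if bits[control]:
            bits[target] ^= 1
        out = sum(b << (n_qubits - 1 - q) for q, b in enumerate(bits))
        U[out, basis] = 1.0
    return U

def reduced_SA(psi):
    """Trace out the environment (last qubit) from a 3-qubit pure state."""
    m = psi.reshape(4, 2)            # rows: joint SA index, columns: E index
    return m @ m.conj().T

def von_neumann_entropy(rho):
    """S = -Tr(rho log rho), computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-(evals * np.log(evals)).sum())

def summary(psi):
    """Coherence between |+, a_+> and |-, a_->, and the entropy of rho_SA."""
    rho = reduced_SA(psi)
    return abs(rho[0, 3]), von_neumann_entropy(rho)

plus = np.array([1.0, 1.0]) / np.sqrt(2)
ready = np.array([1.0, 0.0])
psi_initial = np.kron(np.kron(plus, ready), ready)   # stage (11.6)
psi_premeasured = cnot(0, 1) @ psi_initial            # stage (11.7)
psi_branched = cnot(1, 2) @ psi_premeasured           # stage (11.8)

coh_initial, S_initial = summary(psi_initial)
coh_premeasured, S_premeasured = summary(psi_premeasured)
coh_branched, S_branched = summary(psi_branched)
print(coh_premeasured, S_premeasured, coh_branched, S_branched)
```

After pre-measurement the system and apparatus are entangled but still coherent, with zero entropy. Only once the environment has recorded the outcome does the off-diagonal term vanish and the entropy rise to log 2.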

The relationship of fluctuations and observations is worth emphasizing. Consider 
again the one-dimensional harmonic oscillator. We can imagine constructing a projection 
operator onto the positive values of the coordinate, 


$$\hat{P}_+ = \int_{x>0} dx\, |x\rangle\langle x|. \tag{11.9}$$

Now in some state $|\psi\rangle$, we can consider the quantity

$$p_+ = \langle\psi|\hat{P}_+|\psi\rangle. \tag{11.10}$$

Figure 11.1 Schematic evolution of a reduced density matrix in the pointer basis. The density matrix
on the left represents a low-entropy situation, where only a few states are represented in the wave 
function. There are no off-diagonal terms, since the pointer states rapidly decohere. The second 
matrix represents the situation after the wave function has branched a few times. In the third matrix, 
the system has reached equilibrium; the density matrix would be diagonal in an energy eigenbasis, 
but in the pointer basis, decoherence has disappeared and the off-diagonal terms are non-zero. 


In conventional laboratory settings, it makes sense to think of this as “the probability that 
the particle is at x > 0”. But that is not strictly correct in EQM. There is no such thing as 
“where the particle is”; rather, the state of the particle is described by its entire wave function.
The quantity $p_+$ is the probability that we would observe the particle at x > 0 were
we to measure its position. Quantum variables are not equivalent to classical stochastic 
variables. They may behave similarly when measured repeatedly over time, in which case 
it is sensible to identify the non-zero variance of a quantum-mechanical observable with 
the physical fluctuations of a classical variable, but the state in EQM is simply the wave 
function, not the collection of possible measurement outcomes. 
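
For the oscillator ground state, the quantity in (11.10) can be evaluated by integrating the squared wave function over x > 0. The sketch below assumes units with hbar = m = 1, omega = 1, and a grid cut off at x = 10 where the Gaussian tail is negligible. The integral is done with a hand-written trapezoid rule.

```python
import numpy as np

omega = 1.0
x = np.linspace(0.0, 10.0, 10001)                 # grid covering x >= 0 only
psi = (omega / np.pi) ** 0.25 * np.exp(-0.5 * omega * x ** 2)  # normalized ground state
density = np.abs(psi) ** 2

# Trapezoid rule for the integral of |psi|^2 over x > 0.
dx = x[1] - x[0]
p_plus = float(np.sum((density[:-1] + density[1:]) / 2.0) * dx)
print(p_plus)
```

By symmetry the result is one half. On the reading given above, this number is the Born-rule weight for a position measurement yielding x > 0 and does not describe a particle that spends half its time on the right.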


11.3 Boltzmann Brains and de Sitter Space 


With this setup in mind, the application to de Sitter space is straightforward. As mentioned 
in the Introduction, observers in de Sitter are surrounded by a horizon with a finite entropy. 
In the vacuum, the quantum state in any horizon patch is given by a time-independent 
thermal density matrix, 


$$\rho \sim \exp(-\beta \hat{H}), \tag{11.11}$$


where the Hamiltonian is the static Hamiltonian for the fields in that patch and $\beta \propto 1/\sqrt{\Lambda}$.

According to the analysis in the previous section, this kind of thermal state does not
exhibit dynamical fluctuations of any sort, including into Boltzmann Brains. It is a stationary
state, so there is no time-dependence in any process. In particular, the entropy is
maximal for the thermal density matrix, so there are no processes corresponding to decoherence
and branching.³ There may be non-zero overlap between some state $|\mathrm{brain}\rangle$ and the
de Sitter vacuum, but there is no dynamics that brings that state into existence on a decoherent
branch of the wave function. Indeed, one way of establishing the thermal nature of the
state is to notice that a particle detector placed in de Sitter space will come to equilibrium
and then stop evolving [25]. Therefore, Boltzmann Brains do not fluctuate into existence in
such a state, and should not be counted among observers in the cosmological multiverse.

3 The idea that quantum fluctuations during inflation are responsible for the density perturbations in our current
universe is unaffected by this reasoning. During inflation the state is nearly stationary, with non-dynamical
vacuum fluctuations as defined above; branching and decoherence occur when the entropy ultimately increases,
for example during reheating.

It is useful to contrast this situation with that of a black hole in a Minkowski back- 
ground. There, as in de Sitter space, we have a horizon with a non-zero temperature and 
finite entropy. However, real-world black holes are not stationary states. They evaporate 
by emitting radiation, and the entropy increases along the way. A particle detector placed 
in orbit around a black hole will not simply come to equilibrium and stop evolving; it 
will detect particles being emitted from the direction of the hole, with a gradually increas- 
ing temperature as the hole shrinks. This is a very different situation than the equilibrium 
de Sitter vacuum. 

It remains to determine whether the stationary vacuum state is actually attained in the 
course of cosmological evolution. There are classical and quantum versions of the cosmic 
no-hair theorem [16, 19, 28]. Classically, the spacetime geometry of each horizon-sized 
patch of a universe with $\Lambda > 0$ asymptotically approaches that of de Sitter space, as
long as it does not contain a perturbation so large that it collapses to a singularity. In the 
context of quantum field theory in curved spacetime, analogous results show that each 
patch approaches the de Sitter vacuum state. Intuitively, this behavior can be thought of 
as excitations leaving the horizon and not coming back, as portrayed in the first part of 
Figure 11.2. The timescale for this process is parametrically set by the Hubble time, and 
will generally be enormously faster than the rate of Boltzmann fluctuations in states that 
have not quite reached the vacuum. Hence, if we think of conventional ΛCDM cosmology
in terms of semiclassical quantum gravity, it seems reasonable to suppose that the model 
does not suffer from a Boltzmann Brain problem. 

The situation is somewhat different if we take quantum gravity into account. In this 
case we are lacking a fully well-defined theory, and any statements we make must have a 
conjectural aspect. A clue, however, is provided by the idea of horizon complementarity [3,
4, 21, 26, 27]. According to this idea, we should only attribute a local spacetime description
to regions on one side of any horizon at a time, rather than globally. For example, we could 
describe the spacetime outside of a black hole, or as seen by an observer who has fallen into 
the black hole, but should not use both descriptions simultaneously; the rest of the quantum 
state can be thought of as living on a timelike “stretched horizon” just outside the null 
horizon itself. Applying this philosophy to de Sitter space leads to the idea that the whole 
theory should be thought of as that of a single horizon patch, with everything normally 
associated with the rest of the universe actually encoded on the stretched cosmological 
horizon. Since the patch has a finite entropy $S$ (approximately $10^{122}$ for the measured value of
$\Lambda$), the corresponding quantum theory is (plausibly) finite-dimensional, with $\dim \mathcal{H} = e^S$.
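
The order of magnitude of the horizon entropy can be checked with the standard area law, S = A / (4 l_P^2), for a de Sitter horizon of radius sqrt(3 / Lambda). The numerical inputs below are assumptions that do not come from the text: a cosmological constant of roughly 1.1e-52 per square metre and the CODATA Planck length.

```python
import math

PLANCK_LENGTH_M = 1.616255e-35   # Planck length in metres (CODATA value)
LAMBDA_PER_M2 = 1.1e-52          # assumed value of the cosmological constant, m^-2

def de_sitter_entropy(cosmological_constant, planck_length=PLANCK_LENGTH_M):
    """Horizon entropy S = A / (4 l_P^2) with horizon radius sqrt(3 / Lambda)."""
    horizon_radius = math.sqrt(3.0 / cosmological_constant)
    area = 4.0 * math.pi * horizon_radius ** 2
    return area / (4.0 * planck_length ** 2)

S = de_sitter_entropy(LAMBDA_PER_M2)
log10_S = math.log10(S)
log10_dim_H = S / math.log(10.0)   # log10 of dim H = e^S
print(log10_S, log10_dim_H)
```

With these inputs the entropy comes out at a few times 10^122, so the dimension of the corresponding Hilbert space is finite but doubly exponentially large.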

Hence, applying horizon complementarity to a universe with a single true de Sitter vacuum,
the intuitive picture behind the cosmic no-hair theorem no longer applies. There is
no outside world for perturbations to escape to; rather, they are absorbed by the stretched
horizon, and will eventually be emitted back into the bulk spacetime, as shown in the middle
part of Figure 11.2. This is consistent with our expectations for a quantum theory on
a finite-dimensional Hilbert space, which should exhibit fluctuations and Poincaré recurrences.
This was the case originally considered by Dyson, Kleban and Susskind [2, 7] in
their examination of what is now known as the Boltzmann Brain problem. Nothing in our
analysis changes this expectation; if Hilbert space is finite-dimensional and time evolves
eternally, it is natural to expect that the Boltzmann Brain problem is real. (Although see
Albrecht [1] for one attempt at escaping this conclusion.)

Figure 11.2 Conformal diagrams for de Sitter space in different scenarios. We consider an observer
at the north pole, represented by the line on the left boundary and their causal diamond (solid triangle).
The wavy line represents excitations of the vacuum approaching the horizon. In QFT in curved
spacetime, portrayed on the left, the excitation exits and the state inside the diamond approaches
the vacuum, in accordance with the cosmic no-hair theorems. In contrast, horizon complementarity
implies that excitations are effectively absorbed at the stretched horizon (dashed curve just inside the
true horizon) and eventually return to the bulk, as shown in the middle diagram. The third diagram
portrays the situation when the de Sitter minimum is a false vacuum, and the full theory contains a
state with $\Lambda = 0$; the upper triangle represents nucleation of a bubble of this Minkowski vacuum. In
that case, excitations can leave the apparent horizon of the false vacuum while remaining inside the
true horizon; we then expect there to be no dynamical Boltzmann fluctuations.

The situation changes if the de Sitter vacuum state is only metastable, and is embedded 
in a larger theory with a $\Lambda = 0$ minimum. In that case the underlying quantum theory will
be infinite-dimensional, since the entropy of Minkowski space is infinite. In a semiclassi- 
cal solution based on the de Sitter vacuum, the dynamics will not be completely unitary, 
since there will be interactions (such as bubble nucleations) connecting different sectors 
of the theory. Although a full understanding is lacking, intuitively we expect the dynamics 
in such states to be dissipative, much as higher-energy excitations of metastable minima 
decay away faster in ordinary quantum mechanics. The Poincaré recurrence time is infi- 
nite, so there is no necessity for Boltzmann fluctuations or recurrences. The spacetime 
viewpoint relevant to this case is portrayed in the third panel of Figure 11.2. The existence 
of Coleman-De Luccia transitions to the $\Lambda = 0$ vacua permits the true horizon size to be
larger than the de Sitter radius, so perturbations can appear to leave the horizon and never
return, even under complementarity.

The complete picture we suggest is therefore straightforward. If we are dealing with
de Sitter vacua in a theory with an infinite-dimensional Hilbert space, we expect horizon
patches to evolve to a stationary quantum vacuum state with no dynamical
fluctuations, so the Boltzmann Brain problem is avoided. This applies to QFT in curved
spacetime, or to complementarity in the presence of $\Lambda = 0$ vacua. If, on the other hand,
the Hilbert space is finite-dimensional, fluctuations are very natural, and the Boltzmann Brain
problem is potentially very real.


11.4 What Happens in the Wave Function? 


We have been careful to distinguish between vacuum fluctuations in a quantum state, which 
can be present even if the state is stationary, and true dynamical processes, such as Boltz- 
mann and measurement-induced (branching) fluctuations. One may ask, however, whether 
our interpretation of stationary states in EQM is the right one. More specifically: is it 
potentially legitimate to think of a stationary quantum state as a superposition of many 
time-dependent states? This is a particular aspect of a seemingly broader question: what 
“happens” inside the quantum wave function? 

One way to address this question is by using the decoherent (or consistent) histories 
formalism [10, 12-15, 20]. This approach allows us to ask when two possible histories of
a quantum system decohere from each other and can be assigned probabilities. We might 
want to say that an event (such as a fluctuation into a Boltzmann Brain) “happens” in the 
wave function if that event occurs as part of a history that decoheres from other histories in 
some consistent set. (Although we will argue that, in fact, this criterion is too forgiving.) 

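Consider a closed system described by a density operator $\rho(t_0)$ at an initial time $t_0$. We
want to consider possible coarse-grained histories of the system, described by sequences
of projection operators $\{\hat{P}_\alpha\}$. These operators partition the state of the system at some time
into mutually exclusive alternatives and obey

$$\sum_\alpha \hat{P}_\alpha = 1, \qquad \hat{P}_\alpha \hat{P}_\beta = \delta_{\alpha\beta} \hat{P}_\alpha. \tag{11.12}$$

A history is described by a sequence of such alternatives, given by a sequence of projectors
at specified times, $\{\hat{P}^{(1)}_{\alpha_1}(t_1), \ldots, \hat{P}^{(n)}_{\alpha_n}(t_n)\}$. At each time $t_i$, we have a distinct set of projectors
$\hat{P}^{(i)}_{\alpha_i}$, and the particular history is described by a vector of specific projectors labeled by $\vec{\alpha}$.
The decoherence functional of two histories $\vec{\alpha}$ and $\vec{\alpha}'$ is

$$D(\vec{\alpha}, \vec{\alpha}') = \mathrm{Tr}\left[\hat{P}^{(n)}_{\alpha_n}(t_n) \cdots \hat{P}^{(1)}_{\alpha_1}(t_1)\, \rho(t_0)\, \hat{P}^{(1)}_{\alpha'_1}(t_1) \cdots \hat{P}^{(n)}_{\alpha'_n}(t_n)\right], \tag{11.13}$$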


where the trace is taken over the complete Hilbert space. If the decoherence functional 
vanishes for two histories, we say that those histories are consistent or decoherent, and 
they can be treated according to the rules of classical probability theory. 

Following a suggestion by Lloyd [18], we can apply the decoherent histories formalism 
to a simple harmonic oscillator in its ground state. One choice of projectors is the set given
by the energy eigenstates $|E_n\rangle$ themselves,

$$\hat{P}_n = |E_n\rangle\langle E_n|. \tag{11.14}$$


It is easy to check that the corresponding histories trivially decohere. This simply reflects 
the fact that the system begins in an energy eigenstate and stays there. 

But we are free to consider other sets of projectors as well. Let us restrict our attention to 
an N-dimensional subspace of the infinite-dimensional oscillator Hilbert space, consisting 
of the span of the energy eigenstates $|E_n\rangle$ with $n$ between 0 and $N - 1$. Then Lloyd [18]
points out that we can define phase states

$$|\phi_m\rangle = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} e^{-2\pi i m n/N} |E_n\rangle. \tag{11.15}$$

These have the property that they evolve into each other (up to an overall phase) after
timesteps $\Delta t = 2\pi/N\omega$,

$$e^{-i\hat{H}\Delta t} |\phi_m\rangle = |\phi_{m+1}\rangle. \tag{11.16}$$

Now we can consider histories defined by the phase projectors

$$\hat{P}_m = |\phi_m\rangle\langle\phi_m|, \tag{11.17}$$

evaluated at times separated by $\Delta t$. These histories, it is again simple to check, also
mutually decohere with each other (although, of course, not with the original energy-
eigenstate histories). Each such history describes a time-dependent system, whose phase
rotates around, analogous to a classical oscillator rocking back and forth in its potential.⁴
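
The claim that the phase-projector histories decohere can be verified numerically for a small subspace. The sketch below assumes N = 4, units with hbar = 1, two-time histories, and a sign convention for the phase states (a factor exp(-2 pi i m n / N)) chosen so that one timestep maps each phase state to the next. These choices and the helper names are illustrative assumptions. The projectors are taken in the Heisenberg picture, and the initial state is the oscillator ground state.

```python
import itertools
import numpy as np

N, omega = 4, 1.0
dt = 2 * np.pi / (N * omega)
energies = omega * (np.arange(N) + 0.5)
U = np.diag(np.exp(-1j * energies * dt))          # exp(-i H dt) in the energy basis

def phase_state(m):
    """Phase state |phi_m> with the sign convention assumed in the lead-in."""
    n = np.arange(N)
    return np.exp(-2j * np.pi * m * n / N) / np.sqrt(N)

def projector(state):
    return np.outer(state, state.conj())

def heisenberg(P, k):
    """Heisenberg-picture projector at time k * dt: U^-k P U^k."""
    Uk = np.linalg.matrix_power(U, k)
    return Uk.conj().T @ P @ Uk

rho0 = projector(np.eye(N)[0].astype(complex))    # oscillator ground state
P_phase = [projector(phase_state(m)) for m in range(N)]

def chain(history):
    """Time-ordered product P(t_n) ... P(t_1) for a history (alpha_1, ..., alpha_n)."""
    C = np.eye(N, dtype=complex)
    for k, alpha in enumerate(history, start=1):
        C = heisenberg(P_phase[alpha], k) @ C
    return C

def decoherence_functional(h1, h2):
    """D(h1, h2) = Tr[C_h1 rho0 C_h2^dagger]."""
    return np.trace(chain(h1) @ rho0 @ chain(h2).conj().T)

histories = list(itertools.product(range(N), repeat=2))
D = np.array([[decoherence_functional(a, b) for b in histories] for a in histories])

off_diag_max = np.abs(D - np.diag(np.diag(D))).max()
diag_sum = np.trace(D).real
step_overlap = abs(np.vdot(phase_state(1), U @ phase_state(0)))
print(off_diag_max, diag_sum, step_overlap)
```

The off-diagonal entries vanish to numerical precision and the diagonal entries sum to one, so this set of histories can be assigned classical probabilities even though the underlying state is stationary. The unit overlap confirms that one timestep carries each phase state into the next, up to an overall phase.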

We therefore have two (and actually, many more) sets of histories, which decohere 
within the sets, but are mutually inconsistent with each other. In some sets there is no 
time-dependence, while in others there is. In the stationary thermal state of a de Sitter hori- 
zon patch, there is no obstacle in principle to defining a set of decoherent histories with the 
properties that some of them describe Boltzmann Brains fluctuating into existence. On the 
other hand, we are not forced to consider such histories; there are also consistent sets in 
which the states remain perfectly stationary. 

This situation raises a fundamental puzzle. When we are doing multiverse cosmology, 
we often want to ask what is seen by observers satisfying certain criteria (which may be as 
general as “all intelligent observers” or as specific as “observers in precisely defined local 
conditions”). To answer that question, we want to know whether an amplitude correspond- 
ing to such an observer is actually physically realized in the quantum state of the universe. 
The decoherent histories formalism seems to give an ambiguous answer to the question:
the number of observers who physically appear in the universe depends on the projection
operators we choose to define our coarse-grained histories. This seems to introduce an
unacceptably subjective element into a purportedly objective calculation. (A closely related
problem has been emphasized by Kent [17].)

4 It was not necessary to carefully choose the phase states. In the decoherent histories formalism, we have the
freedom to choose projection operators separately at each time. Given some initial projectors, we can always
define projectors at later times by simply evolving them forward by an appropriate amount; the resulting
histories will decohere. We thank Mark Srednicki for pointing this out.

Our own conclusion from this analysis is simple: the existence of decoherent histories 
describing certain dynamical processes is not sufficient to conclude that those processes 
“really happen”. Note that something somewhat stronger is going on in the standard 
description of branching and decoherence in EQM. There, the explicit factorization of 
Hilbert space into system+apparatus and environment directly implies a certain appropri- 
ate coarse-graining for the macroscopic variables (by tracing over the environment). Of 
course, there is arguably a subjective element in how we define the environment in the first 
place. That choice, however, relies on physical properties of the theory, in particular the 
specific Hamiltonian and its low-energy excitations around some particular background 
state. There have been suggestions that the decoherence properties of realistic systems can 
be defined objectively, by demanding that records of the macroscopic configuration be 
stored redundantly in the environment [22]. 

We conjecture, at least tentatively, that the right way to think about observers fluctuating 
into existence in quantum cosmology is to define an objective division of the variables into 
“macroscopic system” and “environment”, based on the physical properties of the system 
under consideration, and to look for true branching events where the reduced density matrix 
of the system decoheres in the pointer basis.⁵ Work clearly remains to be done in order to
turn this idea into a well-defined program, as well as to justify why such a procedure is 
the appropriate one. In this context, it is useful to keep in mind that Boltzmann Brains are 
a difficulty, not a desirable feature, of a given cosmological model. We suggest that the 
analysis presented here should at the very least shift the burden of proof onto those who 
believe that Boltzmann Brains are a generic problem.

5 A possible alternative strategy is to look for histories that obey classical equations of motion, as in Gell-Mann
and Hartle [9, 10]. Such an approach seems unable to resolve the ambiguity presented by the simple harmonic
oscillator.


12.1 Introduction


The formalism of holographic space-time (HST) is an attempt to write down a theory 
of quantum gravity which can treat space-times more general than those accessible to 
traditional string theory. String theory, roughly speaking, treats space-times which are
asymptotically flat or anti-de Sitter (AdS). As classical space-times, these contain infinite 
area causal diamonds on which strict boundary conditions are imposed. The corresponding 
quantum theory has a unique ground state, and the existing formalism describes the evolu- 
tion of small fluctuations around that ground state in terms of evolution operators involving 
the infinite set of possible small fluctuations at the boundary. Local physics is obscure in 
the fundamental formulation of the theory. It emerges only by matching the fundamental 
amplitudes to those of an effective quantum field theory, in a restricted kinematic regime. 
In the AdS case, one must also work in a regime where the AdS radius is much larger than 
the length scale defined by the string tension. That string length scale is bounded below 
by the Planck length. In regimes where the two scales are close, there are no elementary 
stringy excitations. 

The HST formalism works directly with local quantities. Its important properties are 
summarized as follows: 


• The fundamental geometrical object, a time-like trajectory in space-time, is described
by a quantum system with a time dependent Hamiltonian. Four times the logarithm of 
the dimension (= entropy) of the Hilbert space of the system is viewed as the quantum 
avatar of the area of the holographic screen of the maximal causal diamond along the 
trajectory. The causal diamond associated with a segment of a time-like trajectory is 
the intersection of the interior of the backward light cone of the future endpoint of the 
segment, with that of the future light cone of the past point. The holographic screen is 
the maximal area surface on the boundary of the diamond. 

When the entropy of Hilbert spaces is large, space-time geometry is emergent. The case 
of infinite dimension must be treated by taking a careful limit. The Hilbert space comes 
with a built-in nested tensor factorization: $\mathcal{H} = \mathcal{H}_{in}(t) \otimes \mathcal{H}_{out}(t)$. Here $t$ is a discrete parameter,
which labels the length of a proper time interval along the trajectory. $\mathcal{H}_{in}(t)$ is
the Hilbert space describing the causal diamond of that interval. $\mathcal{H}_{out}(t)$ describes all
operators, which commute with those in the causal diamond. We adopt the prescription
from quantum field theory (QFT) that space-like separation is encoded in commutativity of
the operator algebra of a causal diamond, with that associated to the region of space-time 
that is space-like separated from that diamond. The holographic bound on the entropy 
of a finite area diamond allows us to state this in terms of simple tensor factorization. 
The Hamiltonian of the system must be time dependent, in order to couple together only 
degrees of freedom (DOF) in $\mathcal{H}_{in}(t)$ for time intervals shorter than $t$.

• Time is fundamental but relative in HST, while space-time is emergent. By relativity of
time, we mean that each time-like line has its own quantum description of the world. 
Space-time is knit together from the causal diamonds of all intervals along a sufficiently 
rich sampling of trajectories. For each pair of diamonds, Lorentzian geometry gives a 
maximal area diamond in the intersection. The quantum version of this notion is the iden- 
tification of a common tensor factor in the Hilbert spaces of these two quantum systems. 
The initial conditions and Hamiltonians of the two systems must be such that the density
matrices prescribed by the two systems for that common tensor factor are unitarily
equivalent. This is an infinite number of constraints on the dynamics, and the requirement
is quite restrictive. For example, in asymptotically flat space-time, the requirement
of Lorentz invariance of the scattering matrix is a consequence of these consistency con- 
ditions. Note that in HST geometry is an emergent property of quantum systems, but the 
metric is not a fluctuating quantum variable. The causal structure and conformal factor 
(which determine the metric) are determined by the area law and the overlap rules, which 
are not operators in the Hilbert space. 

• The fact that geometry is not a quantum variable fits very nicely with Jacobson’s derivation
of Einstein’s equations [2] as the hydrodynamics of a quantum system whose
equation of state ties entropy to geometry via the area law for causal diamonds. Hydrody- 
namic equations are classical equations valid in high entropy quantum states of systems 
whose fundamental variables have nothing to do with the hydrodynamic variables (the 
latter are ensemble expectation values of quantum operators). There is only one situation 
in which quantized hydrodynamics makes sense: small, low energy fluctuations around 
the ground state (of a system that has a ground state). This accounts for the success of 
quantum field theory in reproducing certain limiting boundary correlation functions in 
asymptotically flat and AdS space-times. A notable feature of Jacobson’s derivation of 
Einstein’s equations from hydrodynamics is that it does not get the cosmological con- 
stant (CC) term. Fischler and I have long argued that the CC is an asymptotic boundary 
condition, relating the asymptotics of proper time and area in a causal diamond. In the 
quantum theory it is a regulator for the number of states. If the CC is positive, the number 
of states is finite. If the CC is negative, it determines the asymptotic growth of the den- 
sity of states at high energy. High energy states are all black holes of large Schwarzschild 
radius. In HST, the value of the CC is one of the characteristics that determines different 
models of quantum gravity, with very different Hamiltonians and fundamental DOF!¹


¹ See the discussion of meta-cosmology below for a model that incorporates many different values of the CC into
one quantum system.

• In four non-compact dimensions, the variables of quantum gravity are spinor functions
$\psi_i^A(p)$ ($p$ is a discrete finite label, which enumerates a cutoff set of eigenfunctions of the
Dirac operator on compact extra dimensions of space). These label the states, described
by local flows of asymptotically conserved quantum numbers, through the conformal
boundary of Minkowski spacetime. The Holographic/Covariant Entropy Principle is
implemented on a causal diamond with finite area holographic screen by cutting off
the Dirac eigenvalue/angular momentum on the two sphere. The fundamental relation is


\[ \pi (R M_P)^2 = L N(N + 1), \]


where $R$ is the radius of the screen, $L$ the number of values of $p$ and $N$ the angular
momentum cutoff. $i$ and $A$ range from 1 to $N$ and $N + 1$, respectively.

• We have constructed [11, 19, 20] a class of models describing scattering theory in
Minkowski space. The basic idea of those models is that localized objects are described
by constrained states, on which of order $EN$, with $E \ll N \to \infty$, matrix elements of
the square matrices $M_i^j(p, q) = \psi_i^{\dagger A}(p)\,\psi_A^j(q)$ vanish, as the size of the diamond goes to
infinity. The Hamiltonian has the form


\[ H_{in}(N) = E + \frac{1}{N^2}\,\mathrm{Tr}\,P(M), \qquad (12.1) \]


where $P$ is a polynomial of $N$-independent order $\geq 7$. The quantity $E$ defining the constraint
is an asymptotically conserved quantum number. The constraints imply that the
matrices can be block diagonalized and, when combined with the large N scaling of 
the Hamiltonian, the blocks are free objects: The Hamiltonian is a sum of commuting 
terms, describing individual blocks. It can be shown that in the limit in which the indi- 
vidual blocks have large size, these objects are supersymmetric particles. For generic 
choices of P(M) the long range interactions have Newtonian scaling with energy and 
impact parameter. All of these models have meta-stable excitations with the properties 
of black holes, which are produced in particle scattering and decay into particles. Scatter- 
ing amplitudes in which black holes are not produced can be described by Feynman-like 
diagrams, with vertices localized on the Planck scale. We have not yet succeeded in 
imposing the HST consistency conditions for trajectories in relative motion, which we 
believe will put strong constraints on $P$ and on the spectrum of allowed particles.²

It is easy to convert the models above into models of de Sitter space, by simply keeping
$N$ finite. This automatically explains both the de Sitter entropy and temperature, the latter
because the definition of a particle state of energy $E$ involves constraining $EN$ of the
$o(N^2)$ DOF on the horizon. The probability of having such a state, within the random
ensemble we call the dS vacuum state,³ is $e^{-EN}$. This says dS space has a temperature
proportional to the inverse of its Hubble radius. The fact that localized energy corresponds
to a constraint on the degrees of freedom in dS space is already evident from the
form of the Schwarzschild–de Sitter black hole solution. It has two horizons, given by
the roots of $(1 - 2GM/r - r^2/R^2)$. It is easy to verify that the sum of the areas of the
two horizons is maximized at $M = 0$, and when $M$ is small, the entropy deficit is that
expected from a Boltzmann factor $e^{-2\pi R M}$.


² The spectrum is encoded in the commutation relations of the variables $\psi_i^A(p)$.
³ It has been known since the seminal work of Gibbons and Hawking [3] that the dS “vacuum state” is actually
a high entropy density matrix.

• We have constructed [4–7] a fully consistent quantum model of cosmology, in which the
universe is described by a flat Friedmann–Robertson–Walker (FRW) metric with a stress
tensor that is the sum of a term with equation of state $p = \rho$ and a term with $p = -\rho$. The
geometry, as anticipated by Jacobson, is a coarse grained thermodynamic description, 
valid in the large entropy limit, of the quantum mechanics of the system. Homogeneity, 
isotropy and flatness are realized for arbitrary initial states. Homogeneity and isotropy 
are the only obvious ways to satisfy the consistency conditions between the descriptions 
of physics along different trajectories, when each trajectory is experiencing randomizing 
dynamics. Flatness follows from an assumption of asymptotic scale invariance for causal 
diamonds much larger than the Planck scale but much smaller than the Hubble scale of 
the CC. The CC itself is an input, basically a declaration that we stop the growth of the 
Hilbert space, but allow time evolution to proceed forever, with a fixed Hamiltonian, 
which has entered the scaling regime describing the asymptotics of the $\Lambda = 0$ model.
Note by the way that the initial singularity does not appear. The geometric description 
is valid only in the limit of large entropy/large causal diamonds and the singularity is an 
extrapolation of this limiting behavior back to a time when the causal diamond is Planck 
size. We have called this model Everlasting Holographic Inflation (EHI). It has an infinite 
number of copies of a space-time which is asymptotically a single horizon volume of 
dS. Unlike field theoretic models of eternal inflation, the different horizon volumes are 
constrained to have identical initial conditions and may be viewed as gauge copies of 
each other. In HST stable dS space is a quantum system with a finite dimensional Hilbert 
space [14, 15]. This model has no local excitations, except those which arise as thermal 
fluctuations in the infinite dS era. It is important to note that the EHI model and the 
more realistic model described below are not the same as the HST model of stable dS 
space. The latter model has infinite, rather than semi-infinite proper time intervals, and a 
Hamiltonian, for each trajectory, which satisfies $H(T) = H(-T)$ for each proper time. In
the limit of infinite dS radius, it approaches the HST model for Minkowski space. Both 
the EHI model and our semi-realistic Holographic Inflation model [1] use the same time 
dependent Hamiltonian H(T), where T runs over a semi-infinite interval. There is no time 
reflection symmetry in the system. EHI and HI differ only in their boundary conditions, 
with the latter having fine-tuned boundary conditions, which guarantee an era where 
localized excitations are approximately decoupled from horizon DOF. In EHI, despite 
its intrinsic time asymmetry, the universe is always in a generic state of its Hilbert space 
at all times, and local excitations, which are of low entropy, because they are defined 
by constrained states on which large numbers of $\psi_i^A(p)$ vanish, arise only as sporadic
thermal fluctuations.

• Perhaps the most important difference between HST and QFT lies in the counting of
entropy. In HST the generic state of the variables in a causal diamond of holoscreen
area $\sim R^2$ has no localized excitations: all of the action takes place on the horizon,
with a Hamiltonian that is not local on the holographic screen. Bulk localized states are
constrained states in which of order $ER$ of the $o(R^2)$ variables are set equal to zero. $R$
is the holoscreen radius, and $E \ll R$ (all these equations are in Planck units) is the
approximately conserved energy (it becomes conserved in the limit R goes to infinity). It 
is only for these constrained states that QFT gives a good description of some transition 
amplitudes. 
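
The Schwarzschild–de Sitter remark in the list above can be checked numerically. The following is a minimal sketch, assuming units with $G = 1$ (the function names `sds_horizons` and `total_entropy` are illustrative and do not come from the text). It finds the two positive roots of $1 - 2GM/r - r^2/R^2$, adds the horizon entropies $A/4G$, and compares the deficit relative to empty dS space with $2\pi R M$.

```python
import numpy as np

def sds_horizons(M, R, G=1.0):
    """Positive roots of f(r) = 1 - 2GM/r - r^2/R^2.

    Multiplying f(r) = 0 by -r R^2 gives the cubic
    r^3 - R^2 r + 2 G M R^2 = 0, which has two positive roots
    (black hole and cosmological horizon) for small enough M.
    """
    roots = np.roots([1.0, 0.0, -R**2, 2.0 * G * M * R**2])
    real = roots[np.abs(roots.imag) < 1e-9].real
    return np.sort(real[real > 0])  # [r_black_hole, r_cosmological]

def total_entropy(M, R, G=1.0):
    """Sum of the two horizon entropies, S = A / 4G with A = 4 pi r^2."""
    r_b, r_c = sds_horizons(M, R, G)
    return np.pi * (r_b**2 + r_c**2) / G

if __name__ == "__main__":
    R, M = 100.0, 0.01                      # illustrative values, M << R
    deficit = np.pi * R**2 - total_entropy(M, R)
    print("entropy deficit :", deficit)
    print("2 pi R M        :", 2 * np.pi * R * M)
```

The agreement follows from the relations between the roots of the cubic: $r_b^2 + r_c^2 + r_b r_c = R^2$, so the total entropy is $\pi (R^2 - r_b r_c)/G$, which is largest at $M = 0$ and falls by approximately $2\pi R M$ when $r_b \approx 2GM \ll R$.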


Our model of the universe we live in, the topic of this chapter, proceeds from a starting 
point identical to that of the model discussed in the penultimate bullet point. However, we 
consider a much larger Hilbert space, for a single trajectory, which includes many copies 
of a single inflationary horizon volume. This corresponds to the growth of the apparent 
horizon, after inflation, in conventional inflationary models. This model contains elements 
of the conventional narrative about cosmological inflation, so we call it the Holographic 
Inflation model. The purpose of the inflationary era in this model is quite different from 
that which inflation serves in field theory models. Homogeneity, isotropy and flatness are 
natural in HST. We need fine tuning of initial conditions in order to get local excitations, 
and this is the purpose that the inflationary era serves. I will comment below on the degree 
of fine tuning required compared to field theoretic inflation, but the most important point is 
that our fine tuning is the minimal amount required to get localized excitations, so the very 
crudest kind of anthropic reasoning, a topikésthrophic restriction (from the Greek word for 
locality), says that, within the HST formalism, a universe with this amount of fine tuning 
of initial conditions is the only kind that can ever be observed. The only assumption that 
goes into this is that any kind of observer will require the approximate validity of local bulk 
physics. It does not require the existence of human beings, or anything like conventional 
biology. 

The derivation of a macroscopic world from quantum mechanics, a world in which 
the ordinary rules of logic and the notion of decoherent histories are valid, relies on 
the existence of macroscopic objects with macroscopic moving parts. In the HST model 
of quantum cosmology, a typical macroscopic object is the entire apparent horizon; 
slightly less typical ones are localized black holes. None of these have complex webs 
of semi-classical collective coordinates. We know how to derive the existence of com- 
plex macro-objects in quantum field theory, even with an ultra-violet cutoff, but in HST 
quantum states approximately describable by QFT are highly atypical. If we want a uni- 
verse in which such states appear as anything but ephemeral thermal fluctuations, we must 
impose constraints on the initial conditions. Thus, in HST, the reason the universe began 
in a low entropy state, is that this is the only way in which the model produces a complex, 
approximately classical world. 


12.2 The Holographic Inflation Model 


Our cosmology begins on a space-like hypersurface, called the Big Bang, which has
the topology of three-dimensional flat space. A sampling of time-like trajectories in the
emergent space-time is labelled by a regular lattice on the space-time, whose geometry
is irrelevant. Probably a general three-dimensional simplicial complex, with the simplicial 
homology of flat space, is sufficient. The quantum avatar of each trajectory is an indepen- 
dent quantum system, whose Hilbert space is finite dimensional if the ultimate value of the 
CC is positive. 

We incorporate causality into our quantum system, by insisting that the Hamiltonian is 
proper time dependent along the trajectory and has the form 


\[ H(t_n) = H_{in}(t_n) + H_{out}(t_n). \qquad (12.2) \]

At proper time $t_n$ the horizon has area

\[ \pi (R_n M_P)^2 = L n(n + 1). \qquad (12.3) \]


$H_{in}(t_n)$ is constructed from the matrices $M$ built out of the subalgebra of $\psi_i^A$ with indices
$i \leq n$ and $A \leq n + 1$. $H_{out}(t_n)$ depends only on the rest of the variables. In our simplest
models, we will not have to specify $H_{out}$, but it will be crucial to our Holographic Inflation
Model.

All of our models choose exactly the same sequence of Hamiltonians, and the same 
initial state, for each trajectory in our lattice. The initial state is, however, completely arbi- 
trary, so there is no fine tuning of initial conditions. Along each trajectory, the Hamiltonian 
$H_{in}(t_n)$ is the trace of a polynomial, $P$, in the matrices $M(t_n)$. The coefficients in $P$ are
chosen randomly at each time $t_n$. Systems like this almost certainly have fast scrambling
behavior [16]. The maximal order of the polynomial is fixed and $\ll N$, but we do not yet
know what it is beyond a bound $\deg P \geq 7$. This bound is required to get the proper scaling
of Newton’s law in regimes that are approximately flat space [17]. Finally, we require
that, when $n$ is large, but $\ll N$, the Hamiltonian approach that of a 1 + 1 dimensional
C(onformal) F(ield) T(heory) on a half line, with central charge of order $n^2$. The system is
obviously finite dimensional, so this statement can only make sense in the presence of UV
and IR cutoffs on the CFT. We choose them to be $\Lambda_{UV} \sim 1/n$, $V_{IR} \sim Ln$, and also insist
that $L \gg 1$ so that the CFT behavior will be manifest in the presence of the cutoffs.

The fast scrambling nature of the dynamics implies that we can make thermodynamic 
estimates of the expectation value of $H_{in}$ and the entropy of the time-averaged density
matrix


\[ E \sim Ln, \qquad S \sim Ln^2. \]
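
As a rough consistency check, these scalings follow from the standard thermodynamics of a 1 + 1 dimensional CFT with central charge $c$ in a region of length $V$ at temperature $T$, for which $E \propto c V T^2$ and $S \propto c V T$. The sketch below assumes that the UV cutoff plays the role of the effective temperature; that identification is not spelled out in the text and is made here only for illustration.

```latex
% Order-of-magnitude check; numerical factors such as \pi/6 are dropped.
% Assumption: the UV cutoff \Lambda_{UV} acts as the effective temperature.
\begin{align*}
  c &\sim n^2, \qquad V_{IR} \sim Ln, \qquad \Lambda_{UV} \sim 1/n, \\
  E &\sim c\, V_{IR}\, \Lambda_{UV}^2 \sim n^2 \cdot Ln \cdot n^{-2} = Ln, \\
  S &\sim c\, V_{IR}\, \Lambda_{UV} \sim n^2 \cdot Ln \cdot n^{-1} = Ln^2 .
\end{align*}
```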


The volume of the bulk region inside the horizon at this time scales like $n^3$, so the energy
and entropy densities scale like


\[ \rho \sim \frac{L}{n^2}, \qquad \sigma \sim \frac{L}{n} \sim \sqrt{L\rho}. \]


If we recall that in a flat FRW metric the horizon size $n$ scales like the cosmological time
$t$, we recognize the first of these equations as the Friedmann equation and the second as
the relation between entropy and energy densities for an equation of state $p = \rho$. This
geometric description was to be expected, from Jacobson’s argument [2] showing that Ein- 
stein’s equations are the hydrodynamics of systems obeying the area/entropy connection. 
Note that the geometric/hydrodynamic description should not be extrapolated into the low 
entropy regime of small n, so that the cosmological singularity of the FRW cosmology 
is irrelevant. Note also that the quantum mechanics of this system is in no sense that of 
the quantized Einstein equations. Quantized hydrodynamics is a valid approximation for 
describing small excitations of a system around its ground state. In the HST models, the 
early universe is very far from its ground state, and does not even have a time-independent 
Hamiltonian.⁴
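
The identification with a $p = \rho$ fluid can be made explicit with a short calculation. The sketch below starts from the estimates $E \sim Ln$, $S \sim Ln^2$ and a bulk volume of order $n^3$, and assumes zero chemical potential and a linear equation of state $p = w\rho$; the parameter $w$ is introduced here purely for illustration.

```latex
\begin{align*}
% Densities from E ~ L n, S ~ L n^2 and bulk volume ~ n^3:
  \rho &\sim \frac{Ln}{n^3} = \frac{L}{n^2}, \qquad
  \sigma \sim \frac{Ln^2}{n^3} = \frac{L}{n}
  \quad\Longrightarrow\quad \sigma \sim \sqrt{L\rho}, \\
% Thermodynamics of a fluid with p = w \rho and no chemical potential:
  \rho + p &= T\sigma, \qquad d\rho = T\, d\sigma
  \quad\Longrightarrow\quad
  \frac{d\sigma}{\sigma} = \frac{1}{1+w}\,\frac{d\rho}{\rho}
  \quad\Longrightarrow\quad \sigma \propto \rho^{1/(1+w)}, \\
% Friedmann equation for a flat FRW metric with horizon size n ~ t:
  H^2 &\propto \rho, \qquad H \sim \frac{1}{t} \sim \frac{1}{n}
  \quad\Longrightarrow\quad \rho \propto \frac{1}{n^2} .
\end{align*}
```

Matching $\sigma \propto \sqrt{\rho}$ to $\sigma \propto \rho^{1/(1+w)}$ requires $w = 1$, that is $p = \rho$, and $\rho \propto 1/n^2$ is the Friedmann scaling.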

In the EHI models, the quantum systems along different trajectories are knit together
into a space-time, by specifying at each time $t_n$ that the maximal causal diamond in the
intersection between causal diamonds of two trajectories that are $D$ steps apart on the
lattice is the tensor factor in each Hilbert space, on which $H_{in}(t_n - D)$ acted. The choice of
identical Hamiltonians and initial state for each trajectory ensures that the density matrix
on this tensor factor at time $t_n$ is independent of which trajectory we choose to view it
from. This infinite set of consistency conditions is the fundamental dynamical principle of 
HST. The locus of all points D steps away on the lattice is identified by this choice as a 
set of points on the surface of a sphere in the emergent space-time, because of the causal 
relations. The fact that the dynamics is independent of the point is rotation invariance on 
that sphere, and this is consistent with the fact that our fundamental variables transform as a 
representation of SU(2) and the Hamiltonian is rotation invariant. We see that homogeneity, 
isotropy and flatness are properties of the space-time of this model, which are independent 
of the choice of initial state. In our more realistic models of the universe, in which local 
objects emerge from a choice of fine tuned initial conditions, we do not yet have a solution 
to the consistency conditions for trajectories with Planck scale spacing, but homogeneity 
and isotropy play a crucial role in satisfying a coarse grained version of the consistency
conditions for trajectories whose spatial separation is of order the inflationary horizon size. 

As $n \to N$, we need to change the rules only slightly. Proper time is now decoupled from
the growth of the horizon. We model this by allowing the system to propagate forever with
the Hamiltonian $H_{in}(N)$. In addition, we do an (approximate) conformal transformation on
the Hamiltonian, rescaling the UV and IR cutoffs so that the total Hamiltonian is bounded
by $1/n \to 1/N$. This is analogous to the transformation from FRW to static observer coordinates
[18] in an asymptotically dS universe, and is appropriate because we are postulating
a time independent Hamiltonian. Finally we have to modify the overlap rules, to be consistent
with the fact that individual trajectories have a finite dimensional Hilbert space. The
new overlaps are the same as the old ones, except that points that are more than $N$ steps
apart on the lattice never have any overlap at any time. The asymptotic causal structure is
then that of dS space, with Hubble radius satisfying $\pi (R_H M_P)^2 = LN(N + 1)$.


⁴ In passing we note that the only known quantum gravitational systems with a ground state are asymptotically
flat and anti-de Sitter space-times. The proper description of these is String Theory-AdS/CFT, and certain
amplitudes in the quantum theory are well approximated by QFT. IMHO, string theorists and conventional
inflation theorists make a mistake in trying to extrapolate that approximation to the early universe.

The simplest metric interpolating between the $p = \rho$ and $p = -\rho$ equations of state is
$a(t) = \sinh^{1/3}(3t/R_H)$, which is an exact solution of Einstein’s equations with a fluid that
has two components with these equations of state. All of the geometric information in our 
model is consistent with this ansatz for the metric. The space-time of our model contains 
no localized excitations. At all times, all DOF are localized on the apparent horizon, in 
a completely thermalized state, obeying none of the constraints which characterize bulk 
localized systems in HST [11, 19, 20]. This EHI model is not a good model of the universe 
we inhabit, although it is, according to the rules of HST, a perfectly good model of quan- 
tum gravity. Localized excitations will occur only as ephemeral thermal fluctuations in the 
eternal dS phase of this cosmology. 
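
The claim that $a(t) = \sinh^{1/3}(3t/R_H)$ solves the flat Friedmann equation with a $p = \rho$ component (energy density $\propto a^{-6}$) plus a $p = -\rho$ component (constant energy density) can be checked numerically. This is a minimal sketch; the function names and the normalization, in which the two components are equal at $a = 1$, are assumptions made here for illustration.

```python
import numpy as np

def scale_factor(t, R_H=1.0):
    """a(t) = sinh^{1/3}(3 t / R_H)."""
    return np.sinh(3.0 * t / R_H) ** (1.0 / 3.0)

def hubble_numeric(t, R_H=1.0, rel_step=1e-4):
    """H = d(ln a)/dt from a central finite difference."""
    h = rel_step * t
    return (np.log(scale_factor(t + h, R_H))
            - np.log(scale_factor(t - h, R_H))) / (2.0 * h)

def friedmann_rhs(t, R_H=1.0):
    """H^2 for stiff matter plus a cosmological constant.

    With rho = rho_Lambda * (1 + a^-6) and H_Lambda = 1 / R_H,
    the flat Friedmann equation reads H^2 = (1 + a^-6) / R_H^2.
    """
    a = scale_factor(t, R_H)
    return (1.0 + a ** -6) / R_H**2

if __name__ == "__main__":
    for t in (0.01, 0.1, 1.0, 3.0):
        print(t, hubble_numeric(t), np.sqrt(friedmann_rhs(t)))
```

At early times the numerical Hubble rate approaches $1/(3t)$, the $p = \rho$ behavior $a \propto t^{1/3}$, and at late times it approaches the constant $1/R_H$ of the dS phase.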


12.2.1 A More Realistic Description of the Universe 


The description of localized objects in HST was worked out in a series of papers [11, 19, 
20] devoted to scattering theory in Minkowski space-time. Here we summarize the results 
and explain how they are used to construct the HST version of inflationary cosmology. The 
variables $\psi_i^A(p)$ are sections of the Dirac-cutoff spinor bundle on the two sphere, so the
matrices $M_i^j(p, q)$ are sections of the bundle of differential forms on the sphere. Two forms
can be integrated over the sphere and the fuzzy analog of integration is the trace, which 
we have used in constructing our model Hamiltonians. Expressions involving a trace of a 
polynomial in M are invariant under unitary conjugation and this invariance converges to 
the group of measure preserving transformations on the sphere (we do not require them 
to be smooth). Saying that a matrix is block diagonal in some basis can be interpreted 
as saying that the corresponding forms vanish outside of some localized region on the 
sphere. In our quantum mechanics, the matrix elements of M are operators, so a statement 
that some of them vanish is a constraint on the states. If this constraint is approximately 
preserved under the action of the Hamiltonian taking us from one causal diamond to the 
next, then this defines the track of a localized object in space-time. 

This connection between localization, and constraints that put the system in a low 
entropy state, is supported by a piece of evidence from classical GR. An object in dS 
space, localized in a region much smaller than the dS radius, will have a dS Schwarzschild 
field. The local entropy will be maximized if the object is a black hole, filling the local- 
ization region, but independently of that choice, the entropy of the horizon shrinks, so that 
the total entropy of the system is less than that of empty dS space. Recall that empty dS 
space is a thermal system, with DOF that must be considered to live on the cosmological 
horizon. The idea that a localized object of energy $E$ is a constraint on a system with $o(N^2)$
DOF, which freezes $o(EN)$ of them, explains the scaling of the temperature with dS radius.
In Banks and colleagues [11, 19, 20] we showed that the same idea explains the conser- 
vation of energy in Minkowski space, as well as the critical impact parameter at which 
particle scattering leads to black hole production, the temperature/mass/entropy of black 
holes, and even the scaling of large impact parameter scattering with energy and impact 
parameter (Newton’s Law). 

All of these results are quite explicit calculations in our quantum mechanical matrix 
models, and they mirror scaling laws usually derived from classical GR. It is crucial to all 
of them that the block diagonal constraint on matrices gives us a finite number of blocks of 
size $K_i$, and one large block of size $N - K \gg 1$, where $K = \sum_i K_i$ turns out to be the
conserved energy. The individual $K_i$ represent the amount of energy going out through the
horizon in different angular directions. Energy is only conserved in the limit $N \to \infty$, since
in that limit the Hamiltonian cannot remove O(KN) constraints. A final result from the 
matrix model shows that large impact parameter scattering amplitudes scale with energy 
and impact parameter as one would expect from Newton’s law. Again, the existence of a 
vast set of very low (o(1/N)) energy DOF, which do not have a particle description, is 
crucial to this result. In passing, I remark that these DOF resolve the firewall paradox of 
AMPS [9-12]. 
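
The statement that a block diagonal constraint freezes $O(KN)$ variables is a simple counting exercise, sketched below. The function name `vanishing_entries` is illustrative, and the count treats each entry of an $N \times N$ matrix as one variable, which is a simplifying assumption rather than the detailed counting of the matrix models.

```python
def vanishing_entries(N, block_sizes):
    """Number of entries forced to vanish in a block diagonal N x N matrix.

    The matrix has small blocks of sizes K_i plus one large block of
    size N - K, with K = sum(K_i). Every entry outside these diagonal
    blocks is constrained to be zero.
    """
    K = sum(block_sizes)
    if K > N:
        raise ValueError("blocks do not fit inside the N x N matrix")
    kept = sum(k * k for k in block_sizes) + (N - K) ** 2
    return N * N - kept

if __name__ == "__main__":
    N, blocks = 1000, [3, 5, 2]          # illustrative sizes, K = 10 << N
    K = sum(blocks)
    print(vanishing_entries(N, blocks))  # exact count
    print(2 * K * N)                     # leading large-N estimate
```

The exact count is $2KN - K^2 - \sum_i K_i^2$, which reduces to $2KN$ when $K \ll N$; the `ValueError` branch is the matrix-model form of the requirement that the blocks fit inside the full matrix.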

The correlation between localization and low entropy is the key to answering Penrose’s 
question⁵ of why the initial state of the universe had low entropy. We have already seen
that an unconstrained initial state of the universe leads to the EHI cosmology, whose 
only localized excitations are low entropy thermal fluctuations late in the dS phase of the 
universe. 

The maximal entropy state with localized excitations, is one in which those excitations 
are black holes. As the horizon expands one may encounter more black holes or the process 
may stop at some fixed horizon size. The maximal set of black holes for a fixed horizon 
size R in Planck units is constrained by the inequality 


\[ \sum_i M_i < R. \qquad (12.4) \]


Note that this constraint can be derived either from the geometric requirement that the 
Schwarzschild radius of the total black hole mass be smaller than the horizon size, or from 
the matrix model constraint that the black hole blocks of the matrix fit inside the full matrix 
available at that value of the horizon size. 

At this point, we must recognize the distinction between time in the matrix model, which
represents the time along a particular trajectory, and FRW time. The time slices in the
matrix model always correspond to “hyperboloids” lying between the boundaries of two
successive causal diamonds, while FRW time corresponds to horizontal lines. If we say
that at some fixed causal diamond size, smaller than the dS radius of the ultimate CC,
the process of new black holes coming into the horizon stops, then trajectories far enough
from the one we have been discussing will see the region around our preferred trajectory
as a collection of black holes localized in a given region. The system will not be, even
approximately, homogeneous on the would-be FRW slices. Some of the black holes may
decay into radiation, but many will be gravitationally bound, and coalesce into large black
holes, which have lower probability of decay. The universe will not look anything like our
own, and it is unlikely to ever produce complex structures, which could play the role of
“observers”. It is often said that once a single galaxy forms, the question of the evolution of
observers is independent of the rest of the cosmos, but that statement depends on defining
a galaxy as a gravitationally bound structure whose constituents are primarily composed
of baryonic matter. In these HST models of the universe, the primordial constituents of the
universe are black holes. Matter, baryonic or otherwise, must be produced in black hole
decay.


⁵ Which of course goes back to Boltzmann. Penrose was the first to raise the issue in the context of the General
Relativistic theory of gravity.

Although we have not explored all such inhomogeneous scenarios, it seems clear that 
a model with a fairly homogeneous black hole gas is much more likely to produce an 
observer-ready cosmology than an inhomogeneous one. If the gas is sufficiently dilute, 
and the black holes sufficiently small compared to the size of the ultimate cosmological 
horizon, they will all decay before they can coalesce into larger black holes. That decay is 
the hot Big Bang in the HST models. 

It is also, as we will see, much easier to solve the HST constraints relating the descrip- 
tions along different time-like trajectories when the universe is homogeneous and isotropic 
in a coarse grained way. Exact homogeneity and isotropy are incompatible with quantum 
mechanics. Our black holes are really quantum systems with finite dimensional Hilbert 
spaces, with a quantum state that is varying by order one on a time scale $n L_P$, the
Schwarzschild radius. Occasionally the Hamiltonian will put us in a state where the value 
of n is effectively smaller, because some of the matrix elements vanish. Thus, the black 
hole radius is to be thought of as an expectation value of an operator which is the trace of 
some polynomial in the matrices M. It will have statistical fluctuations in the time averaged 
density matrix of the system. By the usual rules of statistical mechanics these will be small 
and approximately Gaussian, with size

\[ \frac{\delta n}{n} \sim \frac{1}{n}. \]

Similarly, although the expectation value of the black hole angular momentum is obviously
zero, it will have Gaussian fluctuations of the same order of magnitude. Since these are
fluctuations in collective coordinates of a large chaotic quantum system, and time averaged
fluctuations at that, it is obvious that quantum interference effects in the statistics of these
variables are negligible, of order $e^{-n^2}$.

Fluctuations of the mass and angular momentum of a black hole are fluctuations of the 
spin zero and spin two parts of the Weyl Curvature tensor, and thus have the properties 
of scalar and tensor fluctuations in cosmological perturbation theory. We will postpone an 
analysis of the phenomenological implications of this remark until we have completed the 
sketch of our model.

Now let us return to the matrix model description of our model. For some period of
time, which we call the era of inflation, the horizon size remains fixed at $n$ and the system
evolves with a time independent Hamiltonian


\[ H_{in}(n) = \frac{1}{n^2}\,\mathrm{Tr}\,P(M_n), \qquad (12.5) \]


which is the asymptotic Hamiltonian of our EHI model. Then, the horizon begins to
expand, eventually reaching the cosmological horizon $N$. At time $t_k$, when the horizon
size is $k$, some number of black holes will have come into the horizon. If the average black
hole size is $n$, the $k \times k$ matrices $M_k$, which appear in the Hamiltonian $H_{in}(t_k)$, must be in
a block diagonal state, with some number of blocks $p_k < k/n$ of size $n$. The black hole
energy density is thus 


\[ \rho_{BH} = \frac{p_k n}{k^3} \sim \frac{1}{t_k^2}. \qquad (12.6) \]

If we choose $t_k \propto k$, as we expect for any flat FRW model, then the RHS is just the
Friedmann equation for the energy density if we choose $p_k = n_{BH} k$, with $n_{BH}$ interpreted
as an initial black hole number density.

I do not have space here to sketch the full matrix model treatment of this system, but will 
instead rely on the reader’s familiarity with semi-classical black hole dynamics, together 
with the evidence from Banks and colleagues [11, 19, 20] that thermalized block diagonal 
matrices have qualitative behavior very similar to that of black holes. The initial number 
density of black holes on FRW slices must be $< 1/n^3$, so that the black holes are further
apart than their Schwarzschild radii. What happens next is a competition between two processes:
the growth of fluctuations in a universe dominated by an almost homogeneous gas
of black holes, and the decay of the black holes. As we will recall below, the size of the
primordial scalar (mass) fluctuations in the black hole energy density is $Q \sim \frac{1}{\epsilon n}$,
where $\epsilon$ is a “slow roll parameter”, currently bounded above by $\sim 0.1$ by observation. In
the black hole dominated universe, these will grow to $o(1)$ when the scale factor has grown
by $n\epsilon$, which occurs in FRW time $t_{FRW} \sim (n\epsilon)^{3/2}$, which is much less than the black hole
evaporation time $t_{evap} \sim n^3$. Black holes will thus begin to combine on this time scale,
potentially shutting off the evaporation process and leaving us with a universe dominated 
by large black holes, forever. However, since black holes are separated by distances much 
larger than their Schwarzschild radii by this time, the fluid approximation does not cap- 
ture the full coalescence process. We must also estimate the infall time for two widely 
separated black holes in a local over-density to actually collide. It turns out that this time 
is longer than the decay time as long as $n_{BH} < n^{-3}$. So our model produces a radiation
dominated universe, with a reheat temperature given roughly by the redshifted black hole 
energy density at the decay time 

2 
a (12.7) 
Here $g$ is the effective number of massless species into which the black holes can decay.
The standard model gives $g \sim 10^2$. Supersymmetry, particularly with a low energy SUSY
breaking sector added on, boosts this to $g \sim 10^3$. The relation between $n$ and primordial
fluctuations gives 


Tr <2 "PO = 10° < 10". (12:8) 


This estimate, which takes into account the observations of $Q$ and the observational bound
on $\epsilon$, is in Planck units and corresponds to $T_{RH} < 10^7 - 10^8$ GeV.
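
The time scales entering this competition can be put into numbers with a few lines of arithmetic. The sketch below uses only the scalings $Q \sim 1/(\epsilon n)$, $t_{FRW} \sim (n\epsilon)^{3/2}$ and $t_{evap} \sim n^3$, all in Planck units with order-one factors dropped. The function name `black_hole_era` and the input $Q = 10^{-5}$ are illustrative assumptions, not numbers quoted in the text.

```python
def black_hole_era(Q, eps):
    """Order-of-magnitude time scales of the dilute black hole gas era.

    Q   : size of the primordial scalar fluctuations (illustrative input)
    eps : slow roll parameter
    Returns (n, t_frw, t_evap) in Planck units, with O(1) factors dropped.
    """
    n = 1.0 / (eps * Q)      # black hole size from Q ~ 1 / (eps * n)
    growth = n * eps         # scale factor growth needed for o(1) fluctuations
    t_frw = growth ** 1.5    # FRW time for that growth, t ~ a^(3/2)
    t_evap = n ** 3          # black hole evaporation time
    return n, t_frw, t_evap

if __name__ == "__main__":
    n, t_frw, t_evap = black_hole_era(Q=1e-5, eps=0.1)
    print(f"n      ~ {n:.1e}")
    print(f"t_FRW  ~ {t_frw:.1e}")
    print(f"t_evap ~ {t_evap:.1e}")
```

For these inputs $t_{FRW}$ is many orders of magnitude shorter than $t_{evap}$, which is why the infall time of widely separated black holes, rather than the fluid growth time, decides whether the gas decays before it coalesces.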

What does all of this have to do with inflation? In order to answer that, we have to return 
to the time $t_k$, and ask what the Hamiltonian $H_{out}(t_k)$ must be doing in order to be consistent
with the fact that a new system with $o(n^2)$ DOF is about to be added to $H_{in}(t_k)$, and
also with the overlap conditions of HST. The first of these constraints says that, at the time
it comes into “our” trajectory’s horizon, this system must have been completely decoupled
from the rest of the DOF in the universe. This is consistent with dynamics along the tra- 
jectory that has contained those DOF in its causal diamond since the horizon size grew to 
n, if that trajectory is still experiencing inflation. Here again we must recall the differences 
between the time slicings of individual trajectories and the FRW slicing. The future tip 
of a given trajectory’s causal diamond lies on a particular FRW slice, but its intersection 
with a spatially remote trajectory is at a correspondingly remote FRW time in the past. In 
an asymptotically dS universe, where the scale factor a(n) has a pole at some conformal 
time 70, the last bit of information that comes into the horizon of a given trajectory is on 
an FRW slice with conformal time 79/2. Thus, for an approximately homogeneous and 
isotropic collection of black holes, the universe had to undergo inflation up to this confor- 
mal time. Along each trajectory, until the time f« when the future tip of its diamond hits the 
FRW surface 79/2, the Hamiltonian Hj, (t < t*) is the Hamiltonian of the EHI universe. 

The shortest wavelength fluctuations that occur in this model will have wavelength of
order $n\,a_{\rm NOW}/a_I$, where $a_I$ is the scale factor at the end of inflation. These are fluctuations
that came into the horizon just after inflation ends. The largest wavelength fluctuations are
those which cover the entire sky and have wavelength of order N. In a conventional inflation
theory we would write e%e > ee , whereas in the HST model this is a strict equality. 
There are only as many e-folds as we can see. A crude estimate, given the cosmological 
history we have sketched, gives $N_e \sim 80$. This is larger than the conventional lower bound
$N_e > 60$, because our cosmology has a long period in which the universe is dominated by
a dilute black hole gas. Reheating does not occur immediately after inflation. 


12.2.2 SO(1, 4) Invariance


Work of Maldacena and others [21-28] has shown that current data on the CMB can 
be explained in a very simple framework, with no assumptions about particular mod- 
els. Indeed, it was shown in Banks et al. [29] that even the assumption that fluctuations 
originate from quantized fields is unnecessary. All one needs is the approximate SO(1, 4)
invariance we will demonstrate below. 

If we consider general perturbations of an FRW metric, then as long as the vorticity of the 
perturbed fluid vanishes (which is an automatic consequence of the symmetry assumptions 
below), we can go to co-moving gauge, where the perturbations are in the scalar and spin
two components of the Weyl tensor. The conventional gauge invariant scalar perturbation ζ
is equal, in this gauge, to the local proper time difference between co-moving time slices, 
in units of the background Hubble scale 


(== =, (12.9) 


This equation alone, if ε is small, explains why we have so far seen scalar, but not tensor
curvature fluctuations. The gauge invariant measure of scalar fluctuations is suppressed if 
the background FRW goes through a period in which the variation of the Hubble parameter 
is smaller than the Hubble scale, which is the essential definition of an inflationary period. 
The rest of the data on scalar fluctuations essentially correspond to fitting the background 
H(t). The HST and field theory based models have a different formula [29, 30] relating 
H(t) to the form of the scalar two point function, but if both models have an approximate 
SO(1, 4) symmetry, they can both fit the data. 

Much is made in the inflation literature of the fact that inflationary fluctuations are 
approximately Gaussian, and this is supposed to be evidence that free quantum field theory 
is a good approximation to the underlying model. However, Gaussian fluctuations are a 
good approximation to almost any large quantum system in a regime where the entropy 
is large. Higher order correlation functions are suppressed by powers of $S^{-1/2}$. The HST
model has approximately Gaussian fluctuations for these general entropic reasons, rather 
than the special form of its ground state. The insistence that a huge quantum system be 
in its ground state is a monumental fine tuning of initial conditions, and should count as a 
strike against conventional inflation models. 
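
The entropic argument can be illustrated with a toy calculation. It is only an analogy and not the HST model itself: for a sum of S independent, identically distributed non-Gaussian variables, cumulants add, so the k-th cumulant in units of the standard deviation falls as a power of $S^{-1/2}$. The function names and the choice of a unit exponential distribution are assumptions made purely for illustration.

```python
import math

# Cumulants of a single unit-rate exponential variable: kappa_k = (k-1)!
# (an illustrative, strongly non-Gaussian choice; any distribution with
# finite cumulants shows the same scaling).
def single_cumulant(k: int) -> float:
    return float(math.factorial(k - 1))

def normalized_cumulant(k: int, s: int) -> float:
    """k-th cumulant of a sum of s i.i.d. variables, in units of sigma**k.

    Cumulants are additive for independent variables, so
    kappa_k(sum) = s * kappa_k(single) and sigma**2(sum) = s * kappa_2(single).
    """
    kappa_k = s * single_cumulant(k)
    sigma = math.sqrt(s * single_cumulant(2))
    return kappa_k / sigma**k

for s in (10**2, 10**4, 10**6):
    skew = normalized_cumulant(3, s)   # scales as s**-0.5
    kurt = normalized_cumulant(4, s)   # scales as s**-1
    print(f"S={s:>8}: skewness={skew:.2e}, excess kurtosis={kurt:.2e}")
```

In this toy model each additional order of correlation costs one more factor of $S^{-1/2}$, whatever the distribution of the individual constituents.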

There is a further suppression of non-Gaussian correlations involving at least one scalar 
curvature fluctuation, which follows from Maldacena’s squeezed limit theorem [31] and 
approximate SO(1,4) invariance. Maldacena’s theorem says that in the limit of zero 
momentum for the scalar curvature fluctuation, the three point correlator reduces to some- 
thing proportional to the violation of scale invariance in the two point correlator. The 
symmetry allows us to argue that this suppression is present for all momenta. Since all 
three point functions are suppressed relative to two point functions by a nominal factor of
$10^{-5}$, and this theorem gives us an extra power of $10^{-2}$, we should not expect to see these
non-Gaussian fluctuations if the world is described by any one of a large class of models, 
including both HST and many slow roll inflation models. 

The tensor two point function is the most likely quantity to be measured in the near 
future. HST and slow roll models make different predictions for the tilt of the tensor spec- 
trum. HST predicts exact scale invariance while slow roll inflation models have a tilt of 
r/8, where r is the tensor to scalar ratio. Since we now know that r < 0.1 with 95 per cent
confidence, it will be difficult though not necessarily impossible to observe this difference. 
Of course, if r is much smaller than its observational upper bound, this observation will 
not be feasible. We should however point out that HST has a second source of gravitational 
waves, the decay of black holes. This will have a spatial distribution that mirrors the scalar 
fluctuations, and so should have the same tilt as the scalars, again disagreeing with the 
predictions of standard slow roll models. It is suppressed by a factor 1/g, the number of 
effective massless species into which the black holes can decay, but with no suppression 
for small ε. Finally, I would like to point out that the PIXIE mission [32] will test for short
wavelength primordial gravitational waves and can probably distinguish even a small tilt 
of the spectrum. 
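
To see how small the difference in tensor tilt is, the two predictions quoted above can be compared numerically. This is a minimal sketch: the function name is hypothetical, only the magnitude r/8 stated in the text is used (the sign convention is not specified there), and the bound on r is the one quoted above.

```python
def slow_roll_tensor_tilt(r: float) -> float:
    """Magnitude of the slow roll tensor tilt quoted in the text: r / 8."""
    return r / 8.0

HST_TENSOR_TILT = 0.0   # exact scale invariance, as stated in the text
R_UPPER_BOUND = 0.1     # 95 per cent upper bound on r quoted in the text

max_difference = slow_roll_tensor_tilt(R_UPPER_BOUND) - HST_TENSOR_TILT
print(f"largest tilt difference allowed by r < {R_UPPER_BOUND}: {max_difference:.4f}")
```

Any measurement would therefore have to resolve a tilt of roughly a per cent or less, and the target shrinks in proportion to r.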

The tensor three point function, in all models having approximate SO(1, 4) invariance, 
might be the largest of the non-Gaussian fluctuations, unless r ~ 0.04 or less. It does not 
have the $n_s - 1$ factor from Maldacena’s squeezed limit theorem. It is by far the most
interesting correlation function that humans might someday observe, since symmetry allows
three distinct forms for the three point function. Quantum field theory models predict that
one of the three dominates the other two by a factor of $n > 10^6$ while the third vanishes to
all orders in inverse powers of n. HST models predict two of the three form factors are of 
comparable magnitude, while the third appears to vanish only if a certain space-time reflec- 
tion symmetry is imposed as an assumption. Unfortunately, we will probably not be able 
to measure this three point function in the lifetime of any of my auditors at this conference. 

We turn briefly to the derivation of approximate SO(1, 4) invariance of the fluctuations in
the HST model. As discussed above, the Hamiltonian acting on the Hilbert space of entropy
$L n^2$ in the EHI model is the Cartan generator of an approximate SL(2) algebra. This is the
statement that the early universe is described approximately by a 1 + 1 dimensional CFT.
As DOF come into the horizon, the initial state must be constrained so that the matrices
$M(kn; p, q)$, which appear in $H_{\rm in}(kn)$, are all block diagonal, with blocks of approximate
size n. In order for this to be consistent, $H_{\rm out}(kn)$ must act on all of the blocks that have
not yet come into the horizon, but will in the future, as a sum of independent copies of the
SL(2) Cartan generator, $L_0[a]$. This ensures that these DOF will have dynamics that mirrors
a horizon of fixed size. Once the horizon size has expanded to N = Kn, corresponding to
the observed value of the CC, all of these DOF are embedded in an SU(2) covariant system,
with entropy $L N^2$. We can organize the DOF, so that SU(2) invariance is preserved at all
times, by choosing to define the action of SU(2) so that DOF, which have come into the
horizon when its size is kn, transform in the $[kn] \otimes [kn + 1]$ representation of SU(2).

Once the apparent horizon coincides with the cosmological horizon, we can divide the
entire set of variables up into variables localized at various angles. To visualize this, take
the sphere of radius ~ Kn = N and draw a spherical grid: an icosahedron with triangular
faces, each of which is tiled by equilateral triangles of area $n^2$. We call $\Omega_i$ the solid angular
coordinate of the ith small triangle. Consider the $n^2$ most localized linearly independent
functions we can construct from $o(N^2)$ spherical harmonics, and make a basis which
consists of these localized functions in one particular tile, and all rotations of them to different
tiles.
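
The counting behind this tiling can be sketched numerically. The snippet below is a rough estimate only: it ignores the icosahedral structure and edge effects, and simply divides the area of the sphere by the tile area. The function names and the sample values of N and n are illustrative assumptions, not part of the model's definition.

```python
import math

def tile_count(big_n: float, small_n: float) -> float:
    """Approximate number of tiles of area small_n**2 on a sphere of radius big_n."""
    sphere_area = 4.0 * math.pi * big_n**2
    return sphere_area / small_n**2

def localized_function_count(big_n: float, small_n: float) -> float:
    """Total basis size: small_n**2 localized functions in each tile."""
    return tile_count(big_n, small_n) * small_n**2

# Illustrative (assumed) orders of magnitude, in Planck units.
N_EXAMPLE = 1e61
n_EXAMPLE = 1e6

print(f"tiles           ~ {tile_count(N_EXAMPLE, n_EXAMPLE):.1e}")
print(f"basis functions ~ {localized_function_count(N_EXAMPLE, n_EXAMPLE):.1e}")
```

The number of tiles scales as $(N/n)^2 = K^2$, while the total number of localized functions scales as $N^2$, matching the number of spherical harmonics they are built from.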
The total number of black holes that come into the horizon is certainly no greater than
K = N/n, and the number which have come in when the horizon size is kn is < k. As
we have said, the initial state wave function must be such that the matrices M(kn) are
block diagonal, with a number of small blocks < k. We choose the wave function such
that the black hole (if any) which comes in when the horizon grows from kn to (k + 1)n
is a linear superposition of wave functions in which the black hole DOF are taken from
each of the independent tiles on the cosmological horizon. This makes a state which is
rotation invariant. We also insist that the rate at which black holes are added is uniform
in time. This rate is a parameter of the model which, in FRW slicing, is determined by
the initial black hole number density, $n_{BH}$, at the end of inflation. The fact that the objects
being superposed are large quantum systems with a fast scrambler Hamiltonian and a time 
scale n, means that quantum interference terms in this superposition are negligible, of order
$e^{-n}$, so the prediction of the model is that there is a classical probability distribution for
finding black holes at various positions in the emergent FRW space-time. 

The homogeneous and isotropic nature of the black hole distribution, from the point of 
view of one trajectory, makes it easy to satisfy the HST consistency conditions between 
trajectories. We simply choose the same sequence of both in and out Hamiltonians, and 
the same initial state for each trajectory, and let the geometry of FRW tell us what the 
overlap Hilbert spaces are. This is only a coarse grained solution of the consistency con- 
ditions because both our time steps and the spatial separations of the various trajectories 
are of order n, rather than the Planck scale. Note that apart from the fine tuning necessary 
to guarantee a certain number of localized excitations the conditions of homogeneity and 
isotropy arise naturally, and do not require any extra tuning of the initial state. We can cer- 
tainly find other solutions of the consistency conditions with inhomogeneous distributions 
of black holes. However, we have argued that these will typically lead to cosmologies in 
which the entire content of the universe is a few large black holes, which slowly decay 
back into the horizon of empty dS space. 

At any rate, we can combine the local SL(2)[a] groups with the generators of rotations
to construct an algebra that approximates SO(1, 4). In flat coordinates, the SO(1, 4) algebra
splits into seven generators whose action is geometrically obvious, and three which act in
a non-intuitive way. The geometric generators are the rotations and translations of the flat
coordinates, and the rescaling of the flat space coordinates combined with the translation
of FRW time (rescaling of conformal time). In terms of familiar rotations and boosts in
five-dimensional Minkowski space, the time translation generator is $J_{04}$, the rotations are
$J_{ij}$ and the translations are $J_{+i}$, where the + refers to light front time, $X^0 + X^4$ in the
embedding coordinates of dS space in five-dimensional Minkowski space. The action of
the remaining $J_{-i}$ generators is non-linear in the flat coordinates.
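
The split into seven geometric generators and three others can be checked on the defining five-dimensional representation of SO(1, 4). The sketch below is not the matrix-model construction: it uses ordinary 5×5 Lorentz generators, and the sign and normalization conventions (metric signature, the definition of the light front combinations) are assumptions chosen for illustration.

```python
from itertools import combinations

DIM = 5
ETA = [-1, 1, 1, 1, 1]  # diagonal 5D Minkowski metric, index 0 is time

def generator(a, b):
    """Matrix of J_ab: (J_ab)^m_n = delta^m_a eta_bn - delta^m_b eta_an."""
    J = [[0] * DIM for _ in range(DIM)]
    J[a][b] += ETA[b]
    J[b][a] -= ETA[a]
    return J

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(DIM)) for j in range(DIM)]
            for i in range(DIM)]

def add(A, B, sign=1):
    return [[A[i][j] + sign * B[i][j] for j in range(DIM)] for i in range(DIM)]

def commutator(A, B):
    return add(matmul(A, B), matmul(B, A), sign=-1)

ZERO = [[0] * DIM for _ in range(DIM)]

# All independent J_ab with a < b.
all_generators = {(a, b): generator(a, b) for a, b in combinations(range(DIM), 2)}

dilation = generator(0, 4)                                             # J_04
rotations = [generator(i, j) for i, j in combinations((1, 2, 3), 2)]   # J_ij
translations = [add(generator(0, i), generator(4, i)) for i in (1, 2, 3)]      # J_+i
special = [add(generator(0, i), generator(4, i), sign=-1) for i in (1, 2, 3)]  # J_-i

print(len(all_generators), "generators:",
      1 + len(rotations) + len(translations), "geometric,", len(special), "others")
```

With these conventions the translations commute with one another, and $J_{04}$ acts on $J_{+i}$ and $J_{-i}$ with opposite weights, which is the algebraic statement that $J_{04}$ rescales the flat coordinates.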

In the matrix model we define $J_{ij} = \epsilon_{ijk} J_k$ to be the obvious rotation generator. The rest
of the SO(1, 4) generators are defined in terms of the local SL(2)[a] generators.


$$J_{04} = \sum_a L_0(\Omega[a]),$$

and⁶

$$J_{\pm i} = \sum_a L_\pm(\Omega[a])\,\Omega_i[a].$$


These operators operate only on the tensor factor $\mathcal{H}_{\rm out}(t_k)$ of the Hilbert space, describing
DOF which have not yet come into the horizon. The Hamiltonian of these DOF is a sum 
of non-interacting terms, as one would expect for disjoint horizon volumes in dS space. 

The operators described above satisfy the commutation relations of SO(1, 4) with
accuracy 1/n, when N/n is large. The density matrix of the system outside the horizon is the
tensor product of density matrices $\rho(\Omega_i)$ for the individual, angularly localized, systems of
$n^2$ variables. This is approximately invariant under the individual SL(2) generators at fixed
angle, because of the fact that we have gone through a large number of e-folds of evolution
with the Hamiltonian $L_0(\Omega[a])$. It is also invariant under permutation of the individual blocks
of $n^2$ variables. We have argued that these systems enter the horizon as an SU(2) invariant
distribution of black holes, if we want the model universe to have a radiation dominated 
era. It follows that the distribution of black hole fluctuations on the sky of each trajectory is 
approximately SO(1, 4) invariant. This is what we need, to fit the data on the CMB, Large 
Scale Structure, and galaxy formation. 


12.3 Meta-Cosmology and Anthropic Arguments 


It is already obvious that our resolution of many of the problems of cosmology relies to 
a certain extent on what are commonly called anthropic arguments. The resolution of the 
Boltzmann/Penrose conundrum of why the universe began in a low entropy state is that 
typical initial states evolve under the influence of the HST Hamiltonian into states which 
consist entirely of apparent horizon filling black holes, and that the system asymptotes to 
the density matrix of empty dS space⁷ without ever producing localized excitations. It is, I
would claim, obvious, that any such model cannot have an era with any kind of organized 
behavior that we could call an observer. 

Before proceeding further, I should elaborate on what my “philosophical” stance is 
towards anthropic arguments. As the inventor of one of the first models, which was 
designed to explain the value of the CC on anthropic grounds [33-37], I am certainly 
not someone who rejects the scientific validity of such arguments outright. However, I do 
believe in Albrecht’s razor: “The physicist who has the smallest number of anthropic argu- 
ments in her/his model of the world, wins” [38]. More importantly, I believe that it is clear 
that many of the values of parameters in the standard model cannot be explained anthropi- 
cally [39], especially if one allows for the existence of scales between the Planck scale and 
the scale of electroweak interactions. In particular, anthropic arguments cannot explain the 
existence of multiple generations of quarks and leptons and the bizarre pattern of couplings 
that determine their properties. 


6 [a] is a label for a tile in the spherical grid described above. The sums are over all tiles. 
7 We will discuss more elaborate models, in which the CC itself is selected anthropically, below. 
Some authors attempt to get around these phenomenological problems by combining
anthropic arguments with traditional symmetry arguments, but it is not at all clear that this
makes sense. In particular, in the most popular model for a distribution of parameters in
the standard model, the String Landscape, the rules seem to imply that models with extra
symmetries beyond the standard model gauge group are exponentially improbable [8]. My
own feeling is that, within the context of models where we insist on the standard model
gauge group, the only way we could have a sensible anthropic explanation of what we
see would be if there were only one generation of quarks and leptons, and we relied on
anthropic arguments to determine the weak scale⁸ and insisted that a QCD axion with a
GUT scale decay constant were the dark matter. Apart from the problem of generations, this
framework has depressing implications for high energy physics. We will not find anything 
in accelerators we can imagine building. 

Even this framework is not immune to criticism. All known anthropic arguments, which 
rely on detailed properties of the particle physics we know, are about the properties of 
nuclear physics. This is physics at the MeV scale, and we should really be formulating our 
arguments in terms of an effective field theory at that scale. One might imagine arguing 
that nuclear physics would be irreversibly damaged if the underlying gauge theory did not 
consist of SU(3) × U(1) with the up and down quark masses and the QCD scale having
values close to those in the real world. However, one cannot imagine that the weak inter- 
actions affect nuclear and stellar physics in a way that cannot be mimicked by a host of 
four fermion interactions which are different than those in the standard model. Thus, an 
honest anthropic argument, even one that makes the a priori assumption of our standard 
strong and electromagnetic gauge theory, cannot determine the form of the standard model 
Lagrangian. 

Once one gives up the assumption of life based on the physics and chemistry we know, 
almost all bets are off. We know too little about how physics determines biology, the pos- 
sibility of radically different forms of organization and intelligence, or a host of other 
questions, to even pretend to make anthropic arguments in this wider context, except those 
which rely only on general properties of thermodynamics and gravitation. 

In HST, questions like the nature of the low energy gauge group and the number of
generations are determined by the fundamental commutation relations of the variables, and
are not subject to anthropic selection among possible states in a given model. Parameters
like n, ε and $n_{BH}$, which characterize our cosmology, may well be anthropically selected,
subject to inequalities like $1 \ll n \ll N$ and $n_{BH} < n^{-3}$, but in the models studied so far the
cosmological horizon size is an input parameter. In Banks and Fischler [13] a more general
model was proposed, in which N is a variable. That model is based on the existence of
classical solutions of Einstein’s equations in which a single horizon volume of de Sitter
space is joined onto a stable black hole in the p = ρ FRW model. The black hole and
cosmological horizons coincide. There is a completely explicit quantum mechanical HST
model, whose coarse grained properties match those of this solution. One can also construct
more general solutions, in which such dS black holes, with varying horizon size, relative
positions and velocities, move in the p = ρ background. This allows for anthropic selection
of all of the parameters N, n, ε, $n_{BH}$. The nature of the anthropic arguments is quite different
from conventional ones, because the HST formalism suggests a relation between N and the
scale of supersymmetry breaking. I do not have space to discuss this in detail here, but the
arguments give a plausible explanation of the data.


8 And this leads to unsuccessful predictions of the mass of the Higgs boson.

Another “problem” that this generalized model solves is the existence of an infinite
number of late time observers, “Boltzmann Brains”, whose experience is very different 
from our own. One can arrange the initial positions and velocities of the black holes with 
different interior CC, such that collisions always occur on time scales much longer than 
the current age of the universe, but much shorter than the time scale for production of 
Boltzmann Brains. I do not consider this a major triumph, for the same reason that I do 
not think the BB problem is a serious one. The difference between those two time scales 
is so huge that one can invent an infinite number of changes to the theory, which will 
eliminate the BBs without changing anything that we will, in principle, be able to measure. 
BBs are a problem only to a theory which posits that the explanation for the low entropy 
initial conditions of the universe was a spontaneous fluctuation in a finite system. In HST 
there is an entirely different resolution of the Boltzmann/Penrose question, so BBs are a
silly distraction, which can be disposed of in a way that will never be testable. Indeed, 
much of the structure of the HST model that allows for anthropic selection will remain 
forever beyond our reach, since it depends on initial conditions whose consequences are 
not causally accessible to us until our dS black hole collides with another. Our universe 
then undergoes a catastrophe, on a time-scale of order $10^{61} L_P \sim 10^{10}$ years, somewhat
analogous to what happens when Coleman–De Luccia bubbles collide. After that time, the
low energy effective field theory has changed and we will not be around to see a subsequent 
collision, if one occurs. 
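
The conversion between the two forms of this time scale is simple arithmetic. The sketch below reads the quoted time scale as $10^{61}$ Planck times, which is an interpretation of the text, and uses approximate values for the Planck time and the length of a year; the function name is hypothetical.

```python
PLANCK_TIME_S = 5.39e-44     # Planck time in seconds (approximate)
SECONDS_PER_YEAR = 3.156e7   # one year in seconds (approximate)

def planck_times_to_years(t_planck: float) -> float:
    """Convert a duration given in Planck times to years."""
    return t_planck * PLANCK_TIME_S / SECONDS_PER_YEAR

catastrophe_years = planck_times_to_years(1e61)
print(f"{catastrophe_years:.1e} years")
```

The result is of order $10^{10}$ years, comparable to the current age of the universe, as expected for a time scale set by the cosmological horizon.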

As far as I can tell, any model which implements the anthropic principle will suffer from 
similar problems. It must predict many things, which no local observer can ever observe. 
This is the reason that I subscribe to Albrecht’s razor: a non-anthropic explanation for a 
fact about the universe can be tested more thoroughly than an anthropic model can. This 
does not mean that one can ignore the possibility that some of what we observe depends on 
accidental properties of particular meta-stable states of a system larger than anything we 
can observe. I have spent years trying to find a more satisfactory explanation of the value 
of the CC, and I conclude that this will not be possible. The HST model also suggests that 
n, $n_{BH}$ and the precise form of the early FRW metric during the transition from inflation
to the dilute black hole gas phase are also free parameters, which characterize the initial 
state. They are subject to a combination of entropic and anthropic pressures, and I believe 
they can be pretty well pinned down by these arguments. 

On the other hand, in HST the low energy particle content of the model is determined by 
the super-algebra of the variables and is fixed once and for all. Different spectra correspond 
to different candidate models. My hope is that very few of these candidate models actu- 
ally lead to mathematical consistency. We are familiar from string theory and low energy 
effective field theory that most candidate models of quantum gravity are inconsistent. Even
string models with four-dimensional $\mathcal{N} = 1$ supersymmetry, many of which are consistent
to all orders in perturbation theory, are expected to be actually consistent for at most dis- 
crete values of the various continuous parameters that label these models in perturbation 
theory. Indeed, generically, there is no reason to believe that the perturbation series is accu- 
rate at any of the consistent points. The string perturbation series determines quantities, the 
scattering amplitudes in Minkowski space, which simply do not exist unless the point in 
parameter space at which the model exists preserves both supersymmetry and a discrete 
phase rotation (R) symmetry, which acts on the supersymmetry generators. Such points 
are very sparse in the space of all parameters. 

The theory of SUSY breaking in HST [14, 40] implies that in the limit that the CC is 
taken to zero, the model becomes super-Poincaré invariant, has a discrete R symmetry,⁹
and no continuous moduli. There can be no examples of such a model in perturbative 
string theory, since the string coupling itself appears to be a continuous parameter. General 
arguments in effective super-gravity imply that such models are rare, corresponding to 
solving p + 1 equations for p unknowns. Furthermore, in perturbative string theory, one 
can look at the analogous problem of fixing all parameters besides the string coupling 
and finding a model with a discrete R symmetry. Again, such models are rare. It is thus 
plausible to guess that consistent HST models with vanishing CC are rare, and the gauge 
groups and matter content of these models, as well as their parameters, highly constrained. 
It is not out of the question that we will be able to find arguments that the standard model 
is the unique low energy gauge theory, which can arise at such a point. 


12.4 Conclusions 


On an intuitive level, HST models of cosmology are quite simple. The basic principle 
behind them is that bulk localized excitations in a finite area causal diamond are con- 
strained states of variables living on the boundaries of the diamond. The low entropy of 
the state of the early universe is explained by the necessity of having such localized excita- 
tions in an observable universe: the topikésthropic principle. The initial state with maximal 
localized entropy is a collection of black holes. A competition then ensues between black 
hole coalescence and black hole decay, which must end in most of the black holes decaying 
if the universe is ever to develop complex structures. This requires a fairly uniform dilute 
gas of black holes. Absolute uniformity is impossible, because the black holes are finite 
quantum systems undergoing fast scrambling dynamics. This leads to fluctuations in the 
mass and angular momenta of the black holes, of order 
9 A discrete R symmetry is a discrete symmetry group which acts on the fermionic generators of the SUSY
algebra. 
where S is the black hole entropy. These are the fluctuations we see in the sky.

What is remarkable is that this model actually has all the features traditionally associated 
with inflation. The two crucial ingredients in this conflation of apparently different models 
are the different time slicings associated with the HST model and FRW space-time, and 
the necessity, in the HST model, to describe the evolution of the black holes, before they 
enter the horizon, as decoupled quantum systems of fixed entropy. Each decoupled system 
is the same as the HST model of dS space (with the inflationary Hubble radius, $n L_P$),
so, if we want a homogeneous distribution of black holes, then FRW time slices up to
conformal time $\eta_0/2$ must be described as a collection of decoupled quantum systems
of fixed entropy. $\eta_0$ is the point in conformal time to which our universe converges as
the proper time along trajectories goes to infinity. This system thus corresponds to the
conventional picture of an inflationary universe as many independent horizon volumes of
dS space. The number of e-folds is fixed, with the precise value of $N_e$ dependent on $n_{BH}$,
the primordial density of black holes at the end of inflation. That number density also
determines the reheat temperature of the universe after black hole decay. It is bounded
from above by $n_{BH} < n^{-3}$. The size of primordial scalar fluctuations is $Q \sim (n\epsilon)^{-1}$, where
$\epsilon = -\dot{H}/H^2$ is a slow roll parameter. For ε ~ 0.1 the CMB data tell us that $n \sim 10^6$ and this
implies that the reheat temperature is less than $10^7$–$10^8$ GeV.
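
The orders of magnitude in this paragraph can be reproduced with a few lines of arithmetic. This is a rough sketch: the fluctuation amplitude Q of about $10^{-5}$ and the Planck mass value are assumed inputs not stated in this passage, the relation between Q, n and ε is used only at the level of an order-of-magnitude estimate, and the function names are hypothetical.

```python
PLANCK_MASS_GEV = 1.22e19   # Planck mass in GeV (approximate, assumed input)
Q_OBSERVED = 1e-5           # assumed amplitude of primordial scalar fluctuations

def inflationary_horizon_size(q: float, epsilon: float) -> float:
    """Invert Q ~ 1 / (n * epsilon) for n, in Planck units (order of magnitude)."""
    return 1.0 / (q * epsilon)

def gev_to_planck_units(energy_gev: float) -> float:
    """Express an energy or temperature in units of the Planck mass."""
    return energy_gev / PLANCK_MASS_GEV

n_estimate = inflationary_horizon_size(Q_OBSERVED, 0.1)
print(f"n ~ {n_estimate:.0e}")
for t_gev in (1e7, 1e8):
    print(f"T = {t_gev:.0e} GeV is {gev_to_planck_units(t_gev):.1e} in Planck units")
```

A smaller ε at fixed Q gives a larger n, which is why an upper bound on ε translates into a lower bound on n.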

I also sketched the argument that the HST curvature fluctuations should be approximately
SO(1, 4) invariant, which is enough to account for the data, with detailed features
of the scalar spectrum fit to H(t), the background FRW metric at the end of inflation. This
is just as in conventional inflation models, but those models make much more specific 
assumptions about the state of the quantum system under discussion, assumptions which 
amount to a massive fine tuning of initial conditions. They also have to go through an elab- 
orate discussion, rarely touched on in the mainstream inflation literature, to justify why 
quantum fluctuations in a quantum ground state decohere into a probability distribution for 
the classical curvature fluctuations. 

In contrast, the HST model has fluctuations arising from localized quantum systems with 
a huge number of states, with the curvature interpreted as in Jacobson [2] as a hydrody- 
namic average property. In more familiar terms, the fluctuations are fluctuations in mass 
and angular momentum of mesoscopic black holes.¹⁰ This model also has fine tuning of
initial conditions, but it is the minimal tuning necessary to obtain a universe with localized 
excitations which are not black holes. 

Unfortunately, current cosmological data do not allow us to distinguish between these 
two very different models or many other more exotic field theory models with multiple 
fields, strange forms of kinetic energy, etc. The most likely observational distinctions to 
be measured in the near future will be short wavelength gravitational waves. In the distant 
future, measurement of the tensor three point function might definitively rule out quantum 
field theory as the source of CMB fluctuations. 


10 Black holes for which quantum fluctuations are not entirely negligible, although already decoherent. 1/n is
not negligible, but $e^{-n}$ is.
Progress and Gravity:
Overcoming Divisions Between General Relativity and 
Particle Physics and Between Physics and HPS 


J. BRIAN PITTS 


13.1 Introduction: Science and the Philosophy of Science 


The ancient “problem of the criterion” is a chicken-or-the-egg problem regarding knowl- 
edge and criteria for knowledge, a problem that arises more specifically in relation to 
science and the philosophy of science. How does one identify reliable knowledge without 
a reliable method in hand? But how does one identify a reliable method without reliable 
examples of knowledge in hand? Three possible responses to this problem were entertained 
by Roderick Chisholm: one can be a skeptic, or identify a reliable method(s) (“methodism”),
or identify reliable particular cases of knowledge (“particularism”) (Chisholm,
1973). But why should the best resources be all of the same type? Might not some methods 
and some particular cases be far more secure than all other methods and all other particu- 
lar cases? Must anything be completely certain anyway? Why not mix and match, letting 
putative examples and methods tug at each other until one reaches (a personal?) reflective 
equilibrium? 

This problem arises for knowledge and epistemology, more specifically for science and 
the philosophy of science, and somewhere in between, for inductive inference. Reflective 
equilibrium is Nelson Goodman’s method for induction (as expressed in John Rawls’s 
terminology). One need not agree with Goodman about deduction or take his treatment of 
induction to be both necessary and sufficient to benefit from it. He writes: 


A rule is amended if it yields an inference we are unwilling to accept; an inference is rejected if it 
violates a rule we are unwilling to amend. The process of justification is the delicate one of making 
mutual adjustments between rules and accepted inferences; and in the agreement achieved lies the 
only justification needed for either. 

All this applies equally well to induction. An inductive inference, too, is justified by conformity 
to general rules, and a general rule by conformity to accepted inductive inferences. Predictions are 
justified if they conform to valid canons of induction; and the canons are valid if they accurately 
codify accepted inductive practice. (Goodman, 1983, p. 64, emphasis in the original) 


Most scientists and (more surprisingly) even many philosophers do not take Hume’s prob- 
lem of induction very seriously, although philosophers talk about it a lot. As Colin Howson 
notes, philosophers often declare it to be insoluble and then proceed as though it were 
solved (Howson, 2000). I agree with Howson and Hans Reichenbach (Reichenbach, 1938, 
pp. 346, 347) that one should not let oneself off the hook so easily. That seems especially
true in cosmology (Norton, 2011). Whether harmonizing one’s rules and examples is suf- 
ficient is less clear to me than it was to Goodman, but such reflective equilibrium surely is 
necessary — although difficult and perhaps rare. 

My present purpose, however, is partly to apply Goodman-esque reasoning only to a spe- 
cial case of the problem of the criterion, as well as to counsel unification within physical 
inquiry. What is the relationship between philosophy of science (not epistemology in gen- 
eral) on the one hand, and scientific cosmology and its associated fundamental physics, 
especially gravitation and space-time theory (not knowledge in general) on the other? 
Neither dictation from philosopher-kings to scientists (the analogue of methodism) nor 
complete deference to scientists by philosophers (the analogue of particularism) is Good- 
man’s method. It is not popular for philosophy to give orders to science, but it once was. 
The reverse is more fashionable, a form of scientism or at least a variety of naturalism. I 
hope to show by examples how sometimes each side should learn from the other. 

While Goodman’s philosophy has a free-wheeling relativist feel that might make many 
scientists and philosophers of science nervous, one finds similar views expressed by a law- 
and-order philosopher of scientific progress, Imre Lakatos. According to him, we should 
seek 


a pluralistic system of authority, partly because the wisdom of the scientific jury and its case law 
has not been, and cannot be, fully articulated by the philosopher’s statute law, and partly because 
the philosopher’s statute law may occasionally be right when the scientists’ judgment fails. (Lakatos, 
1971, p. 121) 


Thus there seems to be no irresistible pull toward relativism in seeking reflective 
equilibrium rather than picking one side always to win automatically. 


13.2 Healing the GR vs. Particle Physics Split 


A second division that should be overcome to facilitate the progress of knowledge about 
gravitation and space-time is the general relativist vs. particle physicist split. Carlo Rovelli 
discusses 


... the different understanding of the world that the particle physics community on the one hand and 
the relativity community on the other hand, have. The two communities have made repeated and 
sincere efforts to talk to each other and understand each other. But the divide remains, and, with the 
divide, the feeling, on both sides, that the other side is incapable of appreciating something basic and 
essential.... (Rovelli, 2002) 


This split has a fairly long history going back to Einstein’s withdrawing from mainstream 
fundamental physics from the 1920s — that largely being quantum mechanics, relativistic 
quantum mechanics and quantum field theory. A further issue pertains to the gulf between 
how Einstein actually found his field equations (as uncovered by recent historical work 
(Renn, 2005; Renn and Sauer, 1999, 2007)) and the much better known story that Einstein 
told retrospectively. Work by Jürgen Renn et al. has recovered the importance of Einstein’s
“physical strategy” involving a Newtonian limit, an analogy to electromagnetism, and a 
quest for energy-momentum conservation; this strategy ran alongside the better advertised 
mathematical strategy emphasizing his principles (generalized relativity, general covari- 
ance, equivalence, etc.). Einstein’s reconstruction of his own past is at least in part a 
persuasive device in defense of his somewhat lonely quest for unified field theories (van 
Dongen, 2010). Readers with an eye for particle physics will not miss the similarity to 
the later successful derivations of Einstein’s equations as the field equations of a massless 
spin-2 field assumed initially to live in flat Minkowski space-time (Feynman et al., 1995),
in which the resulting dynamics merges the gravitational potentials with the flat space- 
time geometry such that only an effective curved geometry appears in the Euler-Lagrange 
equations. One rogue general relativist has recently opined: 


HOW MUCH OF AN ADVANTAGE did Einstein gain over his colleagues by his mistakes? Typi- 
cally, about ten or twenty years. For instance, if Einstein had not introduced the mistaken Principle 
of Equivalence and approached the theory of general relativity (GR) via this twisted path, other 
physicists would have discovered the theory of general relativity some twenty years later, via a path 
originating in relativistic quantum mechanics. (Ohanian, 2008, p. 334, capitalization in the original). 


It is much clearer that these derivations work to give Einstein’s equation than it is what 
they mean. Do they imply that one need not and perhaps should not ever have given up flat 
space-time? Do they, on the contrary, show that theories of gravity in flat space-time could 
not succeed, because their best effort turns out to give curved space-time after all (Ehlers, 
1973)? Such an argument is clearly incomplete without contemplation of massive spin-2 
gravity (Freund et al., 1969; Ogievetsky and Polubarinov, 1965). But it might be persuasive 
if massive spin-2 gravity failed — as it seemed to do roughly when Ehlers wrote (not that 
he seems to have been watching). But since 2010 massive spin-2 gravity seems potentially 
viable again (de Rham et al., 2011; Hassan and Rosen, 2011; Maheshwari, 1972) (though
some new issues exist). Do the spin-2 derivations of Einstein’s equations suggest a conven- 
tionalist view that there is no fact of the matter about the true geometry (Feynman et al., 
1995, pp. 112, 113)? Much of one’s assessment of conventionalism will depend on what 
one takes the modal scope of the discussion to be: Should one consider only one’s best 
theory (hence the question is largely a matter of exegeting General Relativity, which will 
favor curved space-time), or should one consider a variety of theories? According to John 
Norton, the philosophy of geometry is not an enterprise rightly devoted to giving a spurious 
air of necessity to whatever theory is presently our best (Norton, 1993, pp. 848, 849). Such 
a view suggests the value of a broader modal scope for the discussion than just our best 
current theory. On the other hand, the claim has been made that the transition from Special 
Relativity to General Relativity is as unlikely to be reversed as the transition from classi- 
cal to quantum mechanics (Ehlers, 1973, pp. 84, 85). If one aspires to proportion belief 
to evidence, that is a startling claim. The transition from classical to quantum mechanics 
was motivated by grave empirical problems; there now exist theorems (no local hidden 
variables) showing how far any empirically adequate physics must diverge from classical. 
But a constructive derivation of Einstein’s equations from a massless spin-2 field shows that one
can naturally recover the phenomena of GR without giving up a special relativistic framework,
in a sense. The cases differ as twilight and day. Ehlers’s remarks are useful, however,
in alerting one to Hegelian undercurrents or other doctrines of inevitable progress in the
general relativity literature. A classic study of doctrines of progress is Bury (1920).


13.3 Bayesianism, Simplicity, and Scalar vs. Tensor Gravity 


While Bayesianism has made considerable inroads in the sciences lately, it is helpful to 
provide a brief sketch before casting further discussion in such terms. I will sketch a rather 
simple version — one that might well be inadequate for science, in which one sometimes 
wants uniform probabilities over infinite intervals and hence might want infinitesimals, for 
example. Abner Shimony’s tempered personalism discusses useful features for a scientifi- 
cally usable form of Bayesianism, including open-mindedness (avoiding prior probabilities 
so close to 0 or 1 that evidence cannot realistically make much difference (Shimony, 1970))
and assigning non-negligible prior probabilities to seriously proposed hypotheses. 

With such qualifications in mind, one can proceed to the sketch of Bayesianism. One 
is not equally sure of everything that one believes, so why not have degrees of belief, and 
make them be real numbers between 0 and 1? Thus one can hope to mathematize logic in 
shades of gray via the probability calculus. Bayes’s theorem can be applied to a theory T
and evidence E:

$$P(T|E) = P(T)\,\frac{P(E|T)}{P(E)}. \qquad (13.1)$$

One wakes up with degrees of belief in all theories (!), “prior probabilities”. One opens
one’s eyes, beholds evidence E, and goes to bed again. While asleep one revises degrees of
belief from priors P(T) to posterior probabilities P(T|E). Today’s P(T|E) becomes tomorrow’s
prior P(T)’. Then one does the same thing tomorrow, getting some new evidence E’,
etc. Now the priors P(T) might be partly subjective. If there are no empirically equivalent
theories and everyone is open-minded, then eventually evidence should bring convergence
of opinion over time (though maybe not soon).

A further wrinkle in the relation between evidence and theory comes from looking at
the denominator of Bayes’s theorem, $P(E) = P(E|T)P(T) + P(E|T_1)P(T_1) + P(E|T_2)P(T_2) + \cdots$.
While one might have hoped to evaluate theory T simply in light of
evidence E, this expansion of P(E) shows that such an evaluation is typically undefined,
because one must spread degree of belief $1 - P(T)$ among the competitors $T_1$, $T_2$, etc.
Hence the predictive likelihoods $P(E|T_1)$, etc., subjectively weighted, appear unbidden in
the test of T by E. Theory testing generically is comparative, making essential reference
to rival theories. This fact is sometimes recognized in scientific practice, but Bayesianism
can alert one to attend to the question more systematically.

Scientists and philosophers tend to like simplicity. Simplicity might not be objective, but
there is significant agreement regarding scientific examples. That is a good thing, because
there are lots of theories, especially lots of complicated ones, way too many to handle.
If degrees of belief are real numbers (not infinitesimals), then normalization $\sum_i P_i = 1$
requires lots of 0s and/or values getting ever closer to 0 on some ordering (Earman, 1992, pp.
209, 210). There is no clear reason for prior plausibility to peak away from the simple end. 
Plausibly, other things equal, simpler theories are more plausible a priori, getting a higher 
prior P(T) in a Bayesian context. Such considerations are vague, but the alternatives are 
even less principled. 
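
The normalization point can be illustrated with a short worked example. It is only a sketch: the ranking of theories by a complexity index $n$ and the geometric form of the prior are assumptions introduced here for illustration, not part of the argument itself.

```latex
% Illustrative assumption: countably many theories T_n, ranked by
% complexity n = 1, 2, 3, ..., with a geometric prior favoring the simple end.
\[
  P(T_n) = 2^{-n}, \qquad
  \sum_{n=1}^{\infty} P(T_n) = \sum_{n=1}^{\infty} 2^{-n} = 1 .
\]
% For any real-valued prior over countably many theories, convergence of
% the sum forces the terms to shrink:
\[
  \sum_{n=1}^{\infty} P(T_n) = 1
  \;\Longrightarrow\;
  \lim_{n \to \infty} P(T_n) = 0 .
\]
```

Other decreasing assignments would serve equally well. The sketch shows only that real-valued priors must thin out somewhere along the ordering; placing the thin tail at the complicated end is the choice defended in the text.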

One can now apply Bayesian considerations to gravitational theory choice in the 1910s. 
One recalls that Einstein had some arguments against a scalar theory of gravity, which 
motivated his generalization to a tensor theory. Unfortunately they do not work. As 
Domenico Giulini has said, 


On his way to General Relativity, Einstein gave several arguments as to why a special-relativistic 
theory of gravity based on a massless scalar field could be ruled out merely on grounds of theoretical 
considerations. We re-investigate his two main arguments, which relate to energy conservation and 
some form of the principle of the universality of free fall. We find such a theory-based a priori 
abandonment not to be justified. Rather, the theory seems formally perfectly viable, though in clear 
contradiction with (later) experiments. (Giulini, 2008, emphasis in original) 


Scalar (spin-0) gravity is simpler than rank-2 tensor (spin-2). Having one potential is 
simpler than having ten, especially if they are self-interacting. With Einstein’s help, Gunnar
Nordström eventually proposed a scalar theory that avoided the theoretical problems mentioned
by Giulini. Given simplicity considerations, Nordström’s theory was more probable
than Einstein’s a priori: $P(T_N) > P(T_{GR})$. Einstein’s further criticisms are generally matters
of taste. So prior to evidence for General Relativity, it was more reasonable to favor
Nordström’s theory. As it actually happened, Einstein’s “final” theory and the evidence
from Mercury both appeared in November 1915, leaving little time for this logical moment 
in actual history. Einstein’s earlier Entwurf theory (Einstein and Grossmann, 1996) could 
be faulted for having negative-energy degrees of freedom and hence likely being unstable 
(a problem with roots in Lagrange and Dirichlet (Morrison, 1998)), although apparently 
no one did so. 

Where was the progress of scientific knowledge, that is, of truth held for good reasons? Mercury’s
perihelion gave non-coercive evidence confirming GR and disconfirming Nordström’s theory.
It was possible to save Nordström’s theory using something like dark matter, that is, matter
(even if not dark, as with Seeliger’s zodiacal light) of which the mass had been neglected (Roseveare,
1982). Hence there was scope for rational disagreement, because Nordström’s theory
was antecedently more plausible,

$$P(T_N) > P(T_{GR}),$$

but evidence favored Einstein’s non-coercively:

$$0 < P(E_{Merc}|T_N) < P(E_{Merc}|T_{GR}).$$


The scene changed in 1919 with the bending of light, which falsified Nordström’s theory:
$P(E_L|T_N) = 0$. There were not then other plausible theories that predicted light bending,
so $P(E_L|T_{GR}) \approx 1 \gg P(E_L)$. It is possible to exaggerate the significance of this result,
as happened popularly but perhaps less so academically (Brush, 1989), where a search for 
plausible rival theories that also predicted light bending was made. (Bertrand Russell may 
have considered Whitehead’s to be an example (Russell, 1927, pp. 75–80).) Unfortunately
many authors wrongly take Einstein’s arguments against scalar gravity seriously (Giulini, 
2008). In the long run one does not make reliable rational progress by siding with genius 
as soon as possible: Einstein made many mistakes (often correcting them himself), some of 
them lucky (Ohanian, 2008) (such as early rejection of scalar theories), followed by barren 
decades. Given this Bayesian sketch, it was rational to prefer GR over Nordström’s scalar
theory only when evidence from Mercury was taken into account, and not necessarily even 
then. The bending of light excluded scalar theories but did not exclude possible rival tensor 
theories. 
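
The Bayesian sketch of this section can be traced numerically. The following is a minimal illustration, not a historical reconstruction: every prior and likelihood below is a hypothetical placeholder chosen only to respect the inequalities stated above, and the catch-all rival `T_other` is an assumption added here to show how rival theories enter the denominator P(E).

```python
# Illustrative Bayesian comparison of Nordström's scalar theory (T_N) and GR (T_GR).
# All numbers are hypothetical placeholders, not historical estimates.

def update(priors, likelihoods):
    """Return posteriors P(T_i|E) from priors P(T_i) and likelihoods P(E|T_i)."""
    # P(E) is the total probability, summed over every rival theory.
    p_e = sum(priors[t] * likelihoods[t] for t in priors)
    if p_e == 0:
        raise ValueError("evidence has zero probability under every theory")
    return {t: priors[t] * likelihoods[t] / p_e for t in priors}

# Assumed priors: the simpler scalar theory starts ahead, P(T_N) > P(T_GR).
priors = {"T_N": 0.5, "T_GR": 0.3, "T_other": 0.2}

# Mercury (assumed likelihoods): 0 < P(E_Merc|T_N) < P(E_Merc|T_GR).
mercury = {"T_N": 0.2, "T_GR": 0.9, "T_other": 0.2}
after_mercury = update(priors, mercury)

# Light bending: P(E_L|T_N) = 0, so the scalar theory is eliminated,
# while a rival that also predicts bending survives with reduced weight.
light = {"T_N": 0.0, "T_GR": 0.9, "T_other": 0.1}
after_light = update(after_mercury, light)

for label, dist in [("prior", priors), ("Mercury", after_mercury), ("light", after_light)]:
    print(label, {t: round(p, 3) for t, p in dist.items()})
```

With these particular numbers GR overtakes the scalar theory after Mercury, but a larger assumed prior gap or a more generous likelihood for the scalar theory (for instance, allowing for neglected matter) would leave Nordström’s theory ahead until the light-bending result, which is the sense in which the Mercury evidence is non-coercive.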


13.4 General Relativity Makes Sense About Energy 


Resolving conceptual problems is a key part of scientific progress (Laudan, 1977). In 
the 1910s and again in the 1950s controversy arose over the status of energy-momentum 
conservation laws of General Relativity. Given Einstein’s frequent invocation of energy- 
momentum conservation in his process of discovery leading to General Relativity (Brading, 
2005; Einstein and Grossmann, 1996; Renn and Sauer, 2007, 1999), as well as his retro- 
spective satisfaction (Einstein, 1916), this is ironic. Partly in response to Felix Klein’s 
dissatisfaction, Emmy Noether’s theorems appeared (Noether, 1918). Her first theorem 
says that a rigid symmetry yields a continuity equation. Her second says that a wig- 
gly symmetry yields an identity among Euler-Lagrange equations, making them not all 
independent. For General Relativity there are four wiggly symmetries, yielding the contracted
Bianchi identities $\nabla_\mu G^{\mu\nu} = 0$. In the wake of the conservation law controversies
there emerged the widespread view that gravitational energy exists, but it “is not local- 
ized”. This phrase appears to mean that gravitational energy is not anywhere in particular, 
although descriptions of it often do have locations. That puzzling conclusion is motivated 
by mathematical results suggesting that where gravitational energy is depends on an arbi- 
trary conventional choice (a coordinate system), and other results that the total energy/mass 
does not. 

While the energy non-localization lore is harmless enough as long as one knows the 
mathematical results on which it is based, it has a self-toxifying quality. Having accepted that
gravitational energy is not localized, one is likely to look askance at the Noether-theoretic 
calculations that yield it: pseudotensors. The next generation of textbooks might then dis- 
pense with the calculations while retaining the lore verbally. Because the purely verbal lore 
is mystifying, at that point one formally gives license to a variety of doubtful conclusions. 
Among these are that because General Relativity lacks conservation laws, it is false — 
a claim at the origins of the just-deceased Soviet/Russian academician A. A. Logunov’s 
high-profile dissent (Logunov and Folomeshkin, 1977). One also hears (for references see 
Pitts (2010)) that the expansion of the universe, by virtue of violating conservation laws, 
is false (a special case of Logunov’s claim). One hears that the expansion of the universe
is a resource for creation science by providing a heat sink for energy from rapid nuclear 
decay during Noah’s Flood. Finally, one hears that General Relativity is more open to the 
soul’s action on the body than is earlier physics, because the soul’s action violates energy 
conservation, but General Relativity already discards energy conservation anyway. That 
last claim is almost backwards, because Einstein’s equations are logically equivalent to 
energy-momentum conservation laws (Anderson, 1967). (If one wants souls to act on bod- 
ies, souls had better couple to gravity also.) The question whether vanishing total energy 
of the universe (given certain topologies) would permit it to pop into being spontaneously 
is also implicated. 

Given that Noether’s theorems — the first, not just the second — apply to GR, can one 
interpret the continuity equations sensibly and block the unfortunate inferences? The 
Noether operator generalizes the canonical stress-energy tensor to give conserved quantities
due to symmetry vector fields $\xi^\mu$ (Bergmann, 1958; Goldberg, 1980; Sorkin, 1977;
Szabados, 1991; Trautman, 1962). For simpler theories than GR, the Noether operator
is a weight 1 tangent vector density $T^\mu_\nu \xi^\nu$, so the divergence of the current $\partial_\mu (T^\mu_\nu \xi^\nu)$
is tensorial (equivalent in all coordinate systems) and, for symmetries $\xi^\nu$, there is conservation
$\partial_\mu (T^\mu_\nu \xi^\nu) = 0$. GR (the Lagrangian density, not the metric!) has uncountably
many ‘rigid’ translation symmetries $x^\mu \to x^\mu + c^\mu$, where $c^\mu{}_{,\nu} = 0$, for any coordinate
system, preserving the action $S = \int d^4x \, \mathcal{L}$. These uncountably many symmetries yield
uncountably many conserved energy-momentum currents. Why can they not all be real? The
lore holds that because there are infinitely many currents, really there are not any. But
just because it is infinite does not mean it is 0 (to recall an old phrase). Getting $\infty = 0$
requires an extra premise, to be uncovered shortly. For GR, the Noether operator is a
conserved but non-tensorial differential operator on $\xi^\mu$, depending on $\partial\xi$ also. Hence one
obtains coordinate-dependent results, with energy density vanishing at an arbitrary point, 
etc., the usual supposed vices of pseudotensors. If one expects only one energy-momentum 
(or rather, four), it should be tensorial, with the transformation law relating faces in dif- 
ferent coordinates. But Noether tells us that there are uncountably many rigid translation 
symmetries. 
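
The tensoriality claim for theories simpler than GR can be checked in a line or two. The following is a sketch under assumptions not made in the text: a metric with determinant $g$ and its Levi-Civita connection are used as one convenient route, and the ordinary vector $j^\mu$ is a symbol introduced here for illustration.

```latex
% Assumed notation: the weight-1 density is written as
% J^\mu = T^\mu{}_\nu \xi^\nu = \sqrt{-g}\, j^\mu .
\[
  \nabla_\mu j^\mu
  = \partial_\mu j^\mu + \Gamma^{\mu}_{\mu\alpha}\, j^\alpha
  = \frac{1}{\sqrt{-g}}\, \partial_\mu\!\left( \sqrt{-g}\, j^\mu \right),
  \qquad
  \Gamma^{\mu}_{\mu\alpha} = \partial_\alpha \ln \sqrt{-g} .
\]
\[
  \Longrightarrow \quad
  \partial_\mu J^\mu = \sqrt{-g}\; \nabla_\mu j^\mu .
\]
```

The right-hand side is a scalar density of weight 1, so if the coordinate divergence vanishes in one coordinate system it vanishes in all. When the Noether operator also involves $\partial\xi$, as in GR, the current is no longer a vector density of this kind and the argument does not go through, which matches the coordinate dependence described above.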

If one simply “takes Noether’s theorem literally” (Pitts, 2010) (apparently novelly, 
although Einstein and Tolman (Tolman, 1930) said nice things about pseudotensors), then 
uncountably many symmetries imply uncountably many conserved quantities. How does 
one get $\infty = 0$? By assuming that the infinity of conserved energies are all supposed to be
faces of the same conserved entity with a handful of components — the key tacit premise 
of uniqueness. Suppose that one is told in Tenerife that “George is healthy” and “Jorge 
está enfermo” (is sick). If one expects the two sentences to be equivalent under translation
(analogous to a coordinate transformation), then one faces a contradiction: George is
healthy and unhealthy. But if George and Jorge then walk into the room together, there
is no tension: George ≠ Jorge. An expectation of uniqueness underlies most objections
to pseudotensors, but it is unclear what justifies that expectation. Making more sense of 
energy conservation makes its appearance in Einstein’s physical strategy in finding his field 
equations less ironic. Indeed, conservation due to gauge invariance is a key step in spin-2
derivations, which improve on Einstein’s physical strategy (Einstein and Grossmann, 1996; 
Deser, 1970; Pitts and Schieve, 2001). Noether commented on converses to her theorems 
(Noether, 1918); one should be able to derive Einstein’s equations from the conservation 
laws, much as the spin-2 derivations do using symmetric gravitational stress-energy (hence 
perhaps needing Belinfante–Rosenfeld technology).

But what is the point of believing in gravitational energy unless it does energetic things? 
Can it heat up a cup of coffee? Where is the physical interaction? Fortunately these ques- 
tions have decent answers: gravitational energy is roughly the non-linearity of Einstein’s 
equations, so it mediates the gravitational self-interaction. 

Why did Hermann Bondi change from a skeptic to a believer in energy-carrying gravitational
waves (Bondi, 1957)?¹ Given a novel plane wave solution of Einstein’s equations
in vacuum, his equation (2), he wrote: 


there is a non-flat region of space between two flat ones, that is, we have a plane-wave zone of finite 
extent in a non-singular metric satisfying Lichnerowicz’s criteria [reference suppressed]. Consider 
now a set of test particles at rest in metric (2) before the arrival of the wave. (Bondi, 1957) 


After the passage of the wave, there is relative motion. 


Clearly, this system of test particles in relative motion contains energy that could be used, for 
example, by letting them rub against a rigid friction disk carried by one of them. (Bondi, 1957) 


This argument has carried the day with most people since that time: gravitational energy- 
transporting waves exist and do energetic things. 

This argument has roots in Feynman (Anonymous, 2015; DeWitt, 1957, p. 143; Feynman
et al., 1995, pp. xxv, xxvi; Kennefick, 2007). John Preskill and Kip Thorne, drawing
partly on unpublished sources, elaborate: 


At Chapel Hill, Feynman addressed this issue in a pragmatic way, describing how a gravitational 
wave antenna could in principle be designed that would absorb the energy “carried” by the wave 
[DeWi 57, Feyn 57]. In Lecture 16, he is clearly leading up to a description of a variant of this 
device, when the notes abruptly end: “We shall therefore show that they can indeed heat up a wall, 
so there is no question as to their energy content.” A variant of Feynman’s antenna was published 
by Bondi [Bond 57] shortly after Chapel Hill (ironically, as Bondi had once been skeptical about the 
reality of gravitational waves), but Feynman never published anything about it. The best surviving 
description of this work is in a letter to Victor Weisskopf completed in February, 1961 [Feyn 61]. 
(Feynman et al., 1995, p. xxv) 


Gravitational energy in waves exists in GR, and one of the main objections to localization
can be managed by taking Noether’s theorem seriously: there are infinitely many
symmetries and energies. Another problem is the non-uniqueness of the pseudotensor,
which one might address with either a best candidate (as in Joseph Katz’s work) or a physical
meaning for the diversity of them in relation to boundary conditions (James Nester
et al.). Even scalar fields have an analogous problem (Callan et al., 1970). With hope
there as well, energy in GR, although still in need of investigation, is not clearly a serious
conceptual problem anymore. That is scientific progress à la Laudan.

¹ I thank Carlo Rovelli for mentioning Bondi.


13.5 Change in Hamiltonian General Relativity 


Supposedly, change is missing in Hamiltonian General Relativity (Earman, 2002). That 
seems problematic for two reasons: change is evident in the world, and change is evident in 
Lagrangian GR in that most solutions of Einstein’s equations lack a time-like Killing vector 
field (Ohanian and Ruffini, 1994, p. 352). A conceptual problem straddling the internal 
vs. external categories is “empirical incoherence”, being self-undermining. According to
Richard Healey, 


[t]here can be no reason whatever to accept any theory of gravity... which entails that there can be 
no observers, or that observers can have no experiences, some occurring later than others, or that 
there can be no change in the mental state of observers, or that observers cannot perform different 
acts at different times. It follows that there can be no reason to accept any theory of gravity... which 
entails that there is no time, or no change. (Healey, 2002, p. 300) 


Hence accepting the no-change conclusion about Hamiltonian GR would undermine rea- 
sons to accept Hamiltonian GR. Change in the world is safe. But what about the surprising 
failure of Hamiltonian–Lagrangian equivalence?

A key issue involves where one looks for change, and relatedly, what one means
by “observables”. According to Earman (who would not dispute the point about the 
scarcity of solutions with time-like Killing vectors), “[n]o genuine physical magni- 
tude countenanced in GTR changes over time” (Earman, 2002). Since the lack of 
time-like Killing vectors implies that the metric does change, clearly genuine physi- 
cal magnitudes must be scarce, rarer than tensors. Tim Maudlin appeals to change in 
solutions to Einstein’s equations: “stars collapse, perihelions precess, binary star sys- 
tems radiate gravitational waves...” but “a sprinkling of the magic powder of the 
constrained Hamiltonian formalism has been employed to resurrect the decomposing 
flesh of McTaggart...” (Maudlin, 2002). Maudlin’s appeal to common sense and Einstein’s
equations is helpful, as is Karel Kuchař’s (Kuchař, 1993), but one needs more
detail, motivation and (in light of Kuchař’s disparate treatments of time and space)
consistency. 

Fortunately the physics reveals a relevant controversy, with reformers recovering 
Hamiltonian–Lagrangian equivalence (Castellani, 1982; Gracia and Pons, 1988; Mukunda,
1980; Pons and Salisbury, 2005; Pons et al., 1997; Pons and Shepley, 1998; Pons et al.,
2010; Sugano et al., 1986). Hamiltonian–Lagrangian equivalence was manifest originally
(Anderson and Bergmann, 1951; Rosenfeld, 1930; Salisbury, 2010); its loss needs study.
In constrained Hamiltonian theories (Sundermeyer, 1982), some canonical momenta are
(in simpler cases) just 0 due to independence of $\mathcal{L}$ from some $\dot{q}^i$; these are “primary constraints”.
In many cases of interest (including electromagnetism, Yang–Mills fields, and
General Relativity), some functions of $p$, $q$, $\partial_i p$, $\partial_i q$, $\partial_i \partial_j q$ are also 0 in order to preserve the
primary constraints over time. Often these “secondary” (or higher) constraints are familiar,
such as the phase space analog $\partial_i p^i = 0$ of Gauss’s law $\nabla \cdot \mathbf{E} = 0$, Gauss–Codazzi
equations embedding space into space-time in General Relativity, etc. Some constraints
have something to do with gauge freedom (time-dependent redescriptions leaving the state
or history alone). One takes Poisson brackets ($q$, $p$ derivatives) of all constraints pairwise. If
the result is in every case 0 (perhaps using the constraints themselves), then all constraints
are “first-class”, as in Clerk Maxwell’s electromagnetism, Yang–Mills, and GR in their
most common formulations. In General Relativity, the Hamiltonian, which determines time 
evolution, is nothing but a sum of first-class constraints (and boundary terms). Given that 
first-class constraints are related to gauge transformations, the key question is how they are 
related. Does each do so by itself, or do they rather work as a team? There is a widespread 
belief that each does so individually (Dirac, 1964). Then the Hamiltonian generates a sum 
of redescriptions leaving everything as it was, hence there is no real change. This is a classi- 
cal aspect of the “problem of time.” Some try to accept this conclusion, but recall Healey’s 
critique. 

Because Einstein’s equations and common sense agree on real change, something must 
have gone wrong in Hamiltonian GR or the common interpretive glosses thereon, but what? 
Here the Lagrangian-equivalent reforming party has given most of the answer, namely, that 
what generates gauge transformations is not each first-class constraint separately, but the 
gauge generator G, a specially tuned sum of first-class constraints, secondary and primary 
(Anderson and Bergmann, 1951; Castellani, 1982; Pons, 2005; Pons et al., 1997, 2010). 
Thus electromagnetism has two constraints at each point but only one arbitrary function; 
GR has eight constraints at each point but only four arbitrary functions. Indeed one can 
show that an isolated first-class constraint makes a mess (Pitts, 2014a,b), such as spoiling
the expected relation $\dot{q} = \delta H / \delta p$ making the canonical momentum equal to the electric
field or the extrinsic curvature of space within space-time. These canonical momenta
are auxiliary fields in the canonical action $\int dt \, d^3x \, (p\dot{q} - \mathcal{H})$, and hence get their physical
meaning from $\dot{q}$. Because each first-class constraint makes a physical difference by itself
(albeit a bad one), the GR Hamiltonian no longer is forced to generate a gauge transfor- 
mation by being a sum of them. There is change in the Hamiltonian formalism whenever 
there is no time-like Killing vector, just as one would expect from Lagrangian equivalence. 


We have been guided by the principle that the Lagrangian and Hamiltonian formalisms should be 
equivalent ... in coming to the conclusion that they in fact are. (Pons and Shepley, 1998, p. 17) 


By the same token, separate first-class constraints do not change $p\dot{q} - \mathcal{H}$ by (at most) a
total derivative, but G does (Pitts 2014a, 2014b). 
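
For electromagnetism the claim that the constraints work as a team can be made concrete. The following sketch rests on conventions assumed here rather than taken from the text: the canonical bracket $\{A_\mu(x), p^\nu(y)\} = \delta^\nu_\mu \, \delta^3(x - y)$, the gauge transformation written as $\delta A_\mu = \partial_\mu \epsilon$, boundary terms dropped, and smearing functions $\epsilon$ and $\xi$ introduced for illustration. Signs and factors differ between references.

```latex
% Primary constraint: p^0 = 0.   Secondary constraint: \partial_i p^i = 0.
% Gauge generator: a tuned sum with ONE arbitrary function \epsilon(t, x).
\[
  G[\epsilon] = \int d^3x \,
  \left( \dot{\epsilon}\, p^0 - \epsilon\, \partial_i p^i \right),
\]
\[
  \{A_0, G\} = \dot{\epsilon}, \qquad
  \{A_i, G\} = \partial_i \epsilon, \qquad
  \{p^\mu, G\} = 0
  \quad \Longrightarrow \quad
  \delta A_\mu = \partial_\mu \epsilon, \quad
  \delta F_{0i} = \partial_0 \partial_i \epsilon - \partial_i \dot{\epsilon} = 0 .
\]
% An isolated primary constraint, smeared with \xi(t, x), instead gives
\[
  \delta A_0 = \xi, \quad \delta A_i = 0
  \quad \Longrightarrow \quad
  \delta F_{0i} = \delta\!\left( \dot{A}_i - \partial_i A_0 \right)
  = -\partial_i \xi \neq 0,
  \qquad \delta p^i = 0 .
\]
```

On these conventions $G$ leaves the field strength alone, whereas the primary constraint acting by itself changes $F_{0i}$ without changing $p^i$, breaking the link between the canonical momentum and the electric field that the text describes.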

To get changing observables in GR, one should recall the distinction between internal 
and external symmetries. Requiring that observables have 0 Poisson bracket with the elec- 
tromagnetic gauge symmetry generator is just to say that things that we cannot observe (in 
the ordinary sense) are unobservable (in the technical sense). By contrast, requiring that
observables have 0 Poisson bracket with the gauge generator in GR implies that the Lie 
derivative of an observable is 0 in every direction. Thus anything that varies spatiotempo- 
rally is “unobservable” — a result that cannot be taken seriously. The problem is generated 
by hastily generalizing the definition from internal to external symmetries. Instead one 
should permit observables to have Lie derivatives that are not 0 but just the Lie derivative of 
a geometric object — an infinitesimal Hamiltonian form of the identification of observables 
with geometric objects in the classical sense (Nijenhuis, 1952), viz., set of components in 
each coordinate system and a transformation law. 


13.6 Einstein’s Real Λ Blunder in 1917


One tends to regard perturbative expansions and geometry as unrelated at best, if not 
negatively related. 


The advent of supergravity [footnote suppressed] made relativists and particle physicists meet. For 
many this was quite a new experience since very different languages were used in the two com- 
munities. Only Stanley Deser was part of both camps. The particle physicists had been brought up 
to consider perturbation series while relativists usually ignored such issues. They knew all about 
geometry instead, a subject particle physicists knew very little about. (Brink, 2006, p. 40) 


But some examples will show how perturbative expansions can help to reveal the geometric 
content of a theory that is otherwise often misunderstood, can facilitate the conception of 
novel geometric objects that one might otherwise fail to conceive, and permit conceptual 
and ontological insight. 

Perturbative expansions can help to reveal the geometric content of a theory that one 
might well miss otherwise. Einstein in his 1917 cosmological constant paper first rein- 
vented a long-range modification of Newtonian gravity (Einstein, 1923) — one might call 
it (anachronistically) non-relativistic massive scalar gravity — previously proposed in the 
nineteenth century by Hugo von Seeliger and Carl Neumann. But he then made a false 
analogy to his new cosmological constant Λ, a mistake never detected till the 1940s (Heckmann,
1942), not widely discussed till the 1960s, and still committed at times today.
According to Einstein, Λ was “completely analogous to the extension of the Poisson
equation to $\Delta\phi - \lambda\phi = 4\pi K\rho$” (Einstein, 1923). Engelbert Schücking, a former student
of Heckmann, provided a firm evaluation: “This remark was the opening line in a
bizarre comedy of errors” (Schucking, 1991). The problem is that Λ is predominantly 0th
order in $\phi$ (having a leading constant term), whereas the modified Poisson equation is 1st order in $\phi$.
Λ gives a weird quadratic potential for a point source, but the modified Poisson equation
gives a massive graviton with plausible Neumann–Yukawa exponential fall-off (Freund
et al., 1969; Schucking, 1991). “However generations of physicists have parroted this nonsense”
(Schucking, 1991). Massive theories of gravity generically involve two metrics,
whereas Λ involves only one. Understanding geometric content sometimes is facilitated
by a perturbative expansion. 
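
The contrast between the two point-source behaviours can be checked numerically. The sketch below is an illustration only and is not taken from the works cited. It assumes units with K = M = 1, treats the modified Poisson equation away from the source as Δφ − λφ = 0, and models the Newtonian-limit effect of Λ as a constant source term, Δφ = −Λ, whose sign and normalization depend on conventions. All function names are hypothetical.

```python
import math

def radial_laplacian(f, r, h=1e-3):
    """Laplacian of a spherically symmetric f(r): f'' + (2/r) f' (central differences)."""
    d1 = (f(r + h) - f(r - h)) / (2 * h)
    d2 = (f(r + h) - 2 * f(r) + f(r - h)) / (h * h)
    return d2 + 2 * d1 / r

def yukawa(r, lam=0.25):
    """Assumed point-source solution of the modified Poisson equation (K = M = 1)."""
    return -math.exp(-math.sqrt(lam) * r) / r

def quadratic(r, Lam=0.25):
    """Assumed potential for a constant source term: its Laplacian equals -Lam."""
    return -Lam * r * r / 6.0

def residual_modified_poisson(r, lam=0.25):
    """Laplacian(phi) - lam * phi, which should vanish away from the source."""
    return radial_laplacian(lambda x: yukawa(x, lam), r) - lam * yukawa(r, lam)

def residual_constant_source(r, Lam=0.25):
    """Laplacian(phi) + Lam, which should vanish for the quadratic potential."""
    return radial_laplacian(lambda x: quadratic(x, Lam), r) + Lam
```

Under these assumptions the first potential falls off exponentially with distance while the second grows quadratically, which is the qualitative difference between a term that is 1st order in $\phi$ and one that is 0th order.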


13.7 Series, Non-linear Geometric Objects, and Atlases


Perturbative series expansions can also be useful for conceptual innovations. For example, 
non-linear realizations of the “group” of arbitrary coordinate transformations have tended 
to be invented with the help of a binomial series expansion for taking the symmetric square 
root of the metric tensor (DeWitt and DeWitt, 1952; Ogievetskii and Polubarinov, 1965; 
Ogievetsky and Polubarinov, 1965). The exponentiating technology of non-linear group 
realizations (Isham et al., 1971) is also at least implicitly perturbative. While classical
differential geometers defined non-linear geometric objects (basically the same as particle
physicists’ non-linear group realizations as applied to coordinate transformations) (Aczél
and Gołąb, 1960; Szybiak, 1966; Tashiro, 1952), they generally provided no examples.

Perhaps the most interesting example involves the square root of the (inverse) metric 
tensor, or rather a slight generalization for indefinite metrics. The result is strictly a square 
root and strictly symmetric using $x^4 = ict$; otherwise it is a generalized square root using
the signature matrix $\eta_{\alpha\beta} = \mathrm{diag}(-1, 1, 1, 1)$. One has $r^{\mu\alpha}\eta_{\alpha\beta}r^{\beta\nu} = g^{\mu\nu}$ and $r^{[\mu\nu]} = 0$.
Under coordinate transformations, the new components $r^{\mu\nu\prime}$ are non-linear in the old ones
(Ogievetsky and Polubarinov, 1965; Pitts, 2012). These entities augment tensor calculus
and have covariant and Lie derivatives (Szybiak, 1963; Tashiro, 1952). 
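
To see how a binomial series produces the generalized square root, one can expand about the signature matrix. The following worked expansion is a sketch under the assumption that the inverse metric is written as $g^{\mu\nu} = \eta^{\mu\nu} + \gamma^{\mu\nu}$ with $\gamma^{\mu\nu}$ small; the symbol $\gamma^{\mu\nu}$ is introduced here for illustration only.

```latex
\begin{align*}
r^{\mu\nu} &= \eta^{\mu\nu} + \tfrac{1}{2}\gamma^{\mu\nu}
  - \tfrac{1}{8}\gamma^{\mu\alpha}\eta_{\alpha\beta}\gamma^{\beta\nu} + O(\gamma^3), \\
r^{\mu\alpha}\eta_{\alpha\beta}r^{\beta\nu}
  &= \eta^{\mu\nu} + \gamma^{\mu\nu}
  + \left(\tfrac{1}{4} - \tfrac{1}{8} - \tfrac{1}{8}\right)
    \gamma^{\mu\alpha}\eta_{\alpha\beta}\gamma^{\beta\nu} + O(\gamma^3)
  = g^{\mu\nu} + O(\gamma^3).
\end{align*}
```

Each displayed term is symmetric when $\gamma^{\mu\nu}$ is, so $r^{[\mu\nu]} = 0$ holds to this order. The series can only be expected to converge for sufficiently small $\gamma^{\mu\nu}$, that is, near the identity.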

Defining the symmetric square root of a metric tensor might seem more of a curios- 
ity for geometric completists than an important insight — but the symmetric square root 
of the metric makes an important conceptual difference with spinor fields used to repre- 
sent fermions. Spinors in GR are widely believed to require an orthonormal basis (Cartan 
and Mercier, 1966; Lawson and Michelsohn, 1989; Weyl, 1929). But they do not, using $r^{\mu\nu}$
(Bilyalov, 2002; DeWitt and DeWitt, 1952; Ogievetskii and Polubarinov, 1965; Ogievetsky
and Polubarinov, 1965). One can have spinors in coordinates, but with metric-dependent
transformations beyond the 15-parameter conformal group (Borisov and Ogievetskii, 1974;
Isham et al., 1971; Ogievetskii and Polubarinov, 1965; Pitts, 2012), the conformal Killing
vectors for the unimodular metric density $\hat{g}_{\mu\nu} = (-g)^{-1/4} g_{\mu\nu}$. Such spinors have Lie
derivatives beyond conformal Killing vectors (often considered the frontier for Lie
differentiation of spinors; Penrose and Rindler, 1986, p. 101), but they sprout new terms in
$\mathcal{L}_\xi \hat{g}_{\mu\nu}$. One can treat symmetries without surplus structure and an extra local O(1, 3) gauge
group to gauge it away. 

The (signature-generalized) square root of a metric, although not very familiar, fits fairly 
nicely into the realm of non-linear geometric objects, yielding a set of components in every 
coordinate system (with a qualification) and a non-linear transformation law. The entity 
is useful especially if one wants to know what sort of space-time structure is necessary 
for having spin-$\frac{1}{2}$ particles in curved space-time (Woodard, 1984). Must one introduce an
orthonormal basis, then discard much of it from physical reality by taking an equivalence 
class under local Lorentz transformations? Or can one get by without introducing anything 
beyond the metric and then throwing (most of?) it away? 

A curious and little known feature of this generalized square root touches on an assump- 
tion usually made in passing in differential geometry. Although one can (often) make a 
binomial series expansion in powers of the deviation of the metric from the signature
matrix, and (more often) one can take a square root using generalized eigenvalues, there 
are exotic coordinate systems in which the generalized square root does not exist due to the 
indefinite signature (Bilyalov, 2002; Deffayet et al., 2013; Pitts, 2012). This fact is trivial to 
show in two space-time dimensions (signature matrix diag(—1, 1)) using the quadratic for- 
mula: just look for negative eigenvalues. The fact generally has not been noticed previously 
because most treatments (a great many are cited in Pitts, 2012) worked near the iden- 
tity. Such a point could have been noticed some time ago by Hoek, but a fateful innocent 
inequality was imposed that restricted the coordinates (with signature $+\,-\,-\,-$).


We shall assume that [the metric tensor $g_{\mu\nu}$] is pointwise continuously connected with the
Minkowski metric (in the space of four-metrics of Minkowski signature) and has $g_{00} > 0$. (Hoek,
1982) 


The lesson to learn is that there can be feedback from the fibers over space-time to the 
atlas of admissible coordinate systems for non-linear geometric objects given an indefinite 
signature. Naively assuming a maximal atlas causes interesting and quite robust entities 
not to exist. Such a result sounds rather dramatic when expressed in modern vocabulary. 
But coordinate inequalities are old (Hilbert, 2007), familiar (Møller, 1972), and not very
dramatic classically; coordinates can have qualitative physical meaning while lacking a 
quantitative one. A principal square root is related to the avoidance of negative eigenvalues
of $g^{\mu\nu}\eta_{\nu\rho}$ (Higham, 1987, 1997). Null coordinates are fine; the coordinate restriction
is mild. Amusingly, coordinate order can be important: if $(x, t, y, z)$ is bad, switching to
$(t, x, y, z)$ suffices (Bilyalov, 2002).
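
The two-dimensional case mentioned above can be sketched in a few lines. This is a minimal illustration, not the construction of any of the cited papers: it assumes the signature matrix diag(−1, 1), uses the quadratic formula to look for negative eigenvalues of $g^{\mu\alpha}\eta_{\alpha\beta}$, and, when none is found, builds the principal square root with the standard closed form for 2×2 matrices. The helper names are hypothetical, and only the principal-root criterion is tested.

```python
import math

ETA = ((-1.0, 0.0), (0.0, 1.0))  # signature matrix diag(-1, 1); ETA times ETA is the identity

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse(m):
    """Inverse of an invertible 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return ((m[1][1] / det, -m[0][1] / det), (-m[1][0] / det, m[0][0] / det))

def has_negative_eigenvalue(x):
    """Quadratic formula for the eigenvalues of a 2x2 matrix."""
    tr = x[0][0] + x[1][1]
    det = x[0][0] * x[1][1] - x[0][1] * x[1][0]
    disc = tr * tr - 4.0 * det
    if disc < 0.0:
        return False  # complex-conjugate pair, so no real eigenvalue at all
    return (tr - math.sqrt(disc)) / 2.0 < 0.0

def generalized_sqrt(g):
    """Return r with r ETA r = inverse(g), or None if the principal root is absent."""
    x = matmul(inverse(g), ETA)  # g^{mu alpha} eta_{alpha beta}
    if has_negative_eigenvalue(x):
        return None
    tr = x[0][0] + x[1][1]
    s = math.sqrt(x[0][0] * x[1][1] - x[0][1] * x[1][0])
    t = math.sqrt(tr + 2.0 * s)
    root = (((x[0][0] + s) / t, x[0][1] / t), (x[1][0] / t, (x[1][1] + s) / t))
    return matmul(root, ETA)  # r = sqrt(x) ETA
```

For example, `generalized_sqrt(((-1.0, 0.0), (0.0, 1.0)))` (Minkowski components in the order (t, x)) returns the signature matrix itself, whereas `generalized_sqrt(((1.0, 0.0), (0.0, -1.0)))` (the same metric with the coordinates listed as (x, t)) returns `None`, a two-dimensional analogue of the remark about coordinate order.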


13.8 Massive Gravity: 1965–72 Discovery of 2010 Pure Spin-2


The recent (re)invention of pure spin-2 massive gravity (de Rham et al., 2011; Hassan 
and Rosen, 2011) used the symmetric square root of the metric, as did the first invention 
(Ogievetsky and Polubarinov, 1965), though not the second (Pitts, 2011; Zumino, 1970). 
This problem has a curious history, from which Ogievetsky and Polubarinov (1965) have 
been unjustly neglected. That paper highly developed the symmetric square root of the 
metric perturbatively. It derived a two-parameter family of massive gravities, which, I note, 
includes two of the original three modern massive pure spin-2 gravities with a flat back- 
ground metric. In light of the dependence of the space-time metric on the lapse function 
$N$ in a 3 + 1 ADM split, there were only two Ogievetsky–Polubarinov theories with any
chance of being linear in the lapse (hence having pure spin-2 (Boulware and Deser, 1972)),
although the naive cross-terms are rather discouraging. These are the $n = \frac{1}{2}$, $p = -2$
theory built around the trace of $(g^{\mu\nu}\sqrt{-g}^{\,2})^{\frac{1}{2}}$, a theory reinvented as equation (3.4) of Hassan and
Rosen (2011), and the $n = -\frac{1}{2}$, $p = 0$ theory built around the trace of $(g_{\mu\nu}\sqrt{-g}^{\,0})^{\frac{1}{2}}$. A truly
novel third theory is now known (Hassan and Rosen, 2011). A second novel modern result 
is the non-linear field redefinition of the shift vector (Hassan et al., 2012), which allows 
the square root of the metric to be linear in the lapse. 

More striking than the proposal of such theories long ago is the fact that in 1971–1972
Maheshwari already showed that one of the Ogievetsky–Polubarinov theories had pure
spin-2 non-linearly (Maheshwari, 1972)! Thus the Boulware–Deser–Tyutin–Fradkin ghost
(Boulware and Deser, 1972; Tyutin and Fradkin, 1972) (the negative energy sixth degree
of freedom that is avoided by Fierz and Pauli to linear order but comes to life non-linearly)
was avoided before it was announced. Unfortunately Maheshwari’s paper made no impact, 
being cited only by Maheshwari in the mid-1980s. With Vainshtein’s mechanism also sug- 
gested in 1972 (Vainshtein, 1972), there was no seemingly insoluble problem for massive 
spin-2 gravity in the literature. Massive spin-2 gravity was largely ignored from 1972 until 
c. 2000 largely because of failure to read Maheshwari’s paper. This example illustrates the 
point (Chang, 2012) that the history of a science has resources for current science. 


13.9 Conclusions 


The considerations above support the idea that progress in knowledge about gravity can 
be made by overcoming various barriers, whether between general relativity and particle 
physics, or between physics and the history and philosophy of science. GR does not need 
to be treated a priori as exceptional, either in justifying choosing GR over rivals or in 
interpreting it. GR is well motivated non-mysteriously using particle physicists’ arguments 
about the exclusion of negative-energy degrees of freedom, arguments that leave only a 
few options possible. To some degree the same holds even for the context of discovery of 
GR, given the renewed appreciation of Einstein’s “physical strategy”. 

Because conceptual problems of GR often can be resolved, there is no need to treat it as 
a priori exceptional in matters of interpretation, either. Regarding gravitational radiation, 
Feynman reflected on the unhelpfulness of GR-exceptionalism: 


What is the power radiated by such a wave? There are a great many people who worry needlessly at 
this question, because of a perennial prejudice that gravitation is somehow mysterious and different— 
they feel that it might be that gravity waves carry no energy at all. We can definitely show that they 
can indeed heat up a wall, so there is no question as to their energy content. (Feynman et al., 1995, 
pp. 219, 220) 


The conservation of energy and momentum (or rather, energies and momenta) makes sense
in relation to Noether’s theorems. Change, even in local observables, is evident in the 
Hamiltonian formulation, just as in the Lagrangian/four-dimensional geometric form. 

To say that GR should not be treated as a priori exceptional is not to endorse the 
strongest readings of the claim that GR is just another field theory, taking gauge-fixing 
and perturbative expansions as opening moves. The mathematics of GR logically entails 
some distinctiveness, such as the difference between external coordinate symmetries (with 
a transport term involving the derivative of the field) and internal symmetries as in elec- 
tromagnetism and Yang-Mills. Identifying such distinctiveness requires reflecting on the 
mathematics and its meaning, as well as gross features of embodied experience, but it does 
not require conjectures about the trajectory of historical progress or divination of the spirit 
of GR. 

Series expansions have their uses in GR. Einstein’s failure to think perturbatively in
1917 about the cosmological constant generated lasting confusion and surely helped to 
obscure massive spin-2 gravity as an option. Many of the (re)inventions of the symmetric 
generalized square root of the metric began perturbatively. It permits spinors in coordinates, 
a fundamental geometric result, just as was Weyl’s (1929) impossibility claim. Perturbative 
methods should not always be used or always avoided; they are one tool in the tool box for 
the foundations of gravity and space-time. 


Part IV


Quantum Foundations and Quantum Gravity 


14 
Is Time’s Arrow Perspectival? 


CARLO ROVELLI 


14.1 Introduction 


We observe entropy decrease towards the past. Does this imply that in the past the world 
was in a non-generic microstate? I point out an alternative. The subsystem to which we 
belong interacts with the universe via a relatively small number of quantities, which define 
a coarse graining. Entropy happens to depend on coarse graining. Therefore, the entropy 
we ascribe to the universe depends on the peculiar coupling between us and the rest of the 
universe. Low past entropy may be due to the fact that this coupling (rather than the microstate
of the universe) is non-generic. I argue that for any generic microstate of a sufficiently
rich system there are always special subsystems defining a coarse graining for which the
entropy of the rest is low in one time direction (the “past”). These are the subsystems
allowing creatures that “live in time” — such as those in the biosphere — to exist. I reply to 
some objections raised to an earlier presentation of this idea, in particular by Bob Wald, 
David Albert and Jim Hartle. 


14.2 The Problem 


An imposing aspect of the Cosmos is the mighty daily rotation of Sun, Moon, planets, 
stars and all galaxies around us. Why does the Cosmos so rotate? Well, it is not really the 
Cosmos that rotates, it is us. The rotation of the sky is a perspectival phenomenon: we 
understand it better as due to the peculiarity of our own moving point of view, rather than 
a global feature of all celestial objects. 

A vivid feature of the world is its being in color: each dot of each object has one of 
the colors out of a three-dimensional (3D) color-space. Why? Well, it is us that have three 
kinds of receptors in our eyes, giving the 3D color space. The 3D space of the world’s 
colors is perspectival: we understand it better as a consequence of the peculiarity of our 
own physiology, rather than the Maxwell equations. 

The list of conspicuous phenomena that have turned out to be perspectival is long; 
recognizing them has been a persistent aspect of the progress of science. 

A vivid aspect of reality is the flow of time; more precisely: the fact that the past is dif- 
ferent from the future. Most observed phenomena violate time reversal invariance strongly. 
Could this be a perspectival phenomenon as well? Here I suggest that this is a likely
possibility. 

Boltzmann’s H-theorem and its modern versions show that for most microstates away 
from equilibrium, entropy increases in both time directions. Why then do we observe lower 
entropy in the past? For this to be possible, most microstates around us appear to be very 
non-generic. This is the problem of the arrow of time, or the problem of the source of the 
second law of thermodynamics [1, 2]. The common solution is to believe that the universe 
was born in an extremely non-generic microstate [3]. Roger Penrose even considered the 
possibility of a fundamental cosmological law breaking time-reversal invariance, forcing 
initial singularities to be extremely special (vanishing Weyl curvature) [4].

Here I point out that there is a different possibility: past low entropy might be a 
perspectival phenomenon, like the rotation of the sky. 

This is possible because entropy depends on the system’s microstate but also on the 
coarse graining under which the system is described. In turn, the relevant coarse graining 
is determined by the concrete existing interactions with the system. The entropy we assign 
to the systems around us depends on the way we interact with them — as the apparent 
motion of the sky depends on our own motion. 

A subsystem of the universe that happens to couple to the rest of the universe via macro- 
scopic variables determining an entropy that happens to be low in the past is a system to 
which the universe appears strongly time oriented; as it appears to us. Past entropy may 
appear low because of our own perspective on the universe. 

Specifically, I argue below that the following conjecture is plausible: 


Conjecture. In a sufficiently complex system, there is always some subsystem whose interaction
with the rest determines a coarse graining with respect to which the system satisfies
the second law of thermodynamics (in some time direction). 


An example where this is realized is given below. 

If this is correct, we have a new way for facing the puzzle of the arrow of time: the 
universe is in a generic state, but is sufficiently rich to include subsystems whose coupling 
defines a coarse graining for which entropy increases monotonically. These subsystems are 
those where information can pile up and “information gathering creatures” such as those 
composing the biosphere can exist. 

All phenomena related to time flow, all phenomena that distinguish the past from the 
future, can be traced to (or described in terms of) entropy increase. Therefore the difference 
between past and future may follow from the peculiarities of our coupling to the rest of the 
universe, rather than from a peculiarity of the microstate of the universe; like the rotation 
of the cosmos. 


14.3 A Preliminary Conjecture 


To start with, consider classical mechanics. Quantum theory is discussed in the last section. 
It is convenient to use Gibbs’ formulation of statistical mechanics rather than Boltzmann’s, 
because Boltzmann takes for granted the split of a system in a large number of equal
subsystems (the individual molecules), and this may precisely obfuscate the key point in 
the context of general relativity and quantum field theory, as we shall see. 

Consider a classical system with many degrees of freedom in a (“microscopic”) state $s$,
element of a phase space $\Gamma$, evolving in time as $s(t)$. Let $\{A_n\}$ be a set of (“macroscopic”)
observables, real functions on $\Gamma$ labeled by the index $n$. This set defines a coarse graining.
That is, it partitions $\Gamma$ into unequal regions where the $A_n$ are constant. The largest of these
regions is the equilibrium region. The entropy of a state $s$ can be defined as the logarithm of the volume of
the region where it is located. With a (suitably normalized and time invariant) measure $ds$,
entropy is then


$$S_{A_n} = \log \int_\Gamma ds' \, \prod_n \delta\big(A_n(s') - A_n(s)\big), \qquad (14.1)$$


where the family of macroscopic observables $A_n$ is indicated in subscript to emphasize that
the entropy depends on the choice of these observables. Notice that this definition applies
to any microstate.¹
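
Definition (14.1) can be made concrete with a crude Monte Carlo estimate. The sketch below is illustrative and rests on several assumptions of mine: the microstate is reduced to the positions of a few balls in $[-1, 1]$ (momenta are ignored), the single observable is the mean position, the delta function is smeared into a window of half-width `eps`, and the measure is normalized so that the whole space has volume 1. The function names are hypothetical.

```python
import math
import random

def mean_x(state):
    """Macroscopic observable A: mean x coordinate of the balls."""
    return sum(state) / len(state)

def entropy_estimate(a_value, n_balls=10, eps=0.05, samples=20000, seed=0):
    """Log of the fraction of sampled microstates with |A - a_value| < eps."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Draw a microstate uniformly from the (normalized) configuration space.
        state = [rng.uniform(-1.0, 1.0) for _ in range(n_balls)]
        if abs(mean_x(state) - a_value) < eps:
            hits += 1
    return math.log(hits / samples) if hits else float("-inf")
```

With these choices the region around $A = 0$ is the largest one, so a microstate with $A$ near 0 is assigned a higher entropy than one with $A$ near 0.3, in line with the identification of the largest region with equilibrium.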

As the microstate s evolves in time so does its entropy 


$$S_{A_n}(t) = \log \int_\Gamma ds' \, \prod_n \delta\big(A_n(s') - A_n(s(t))\big). \qquad (14.2)$$


Boltzmann’s H-theorem and its modern versions imply that, under suitable ergodic conditions,
if we fix the choice of the macroscopic observables $A_n$, then for most microstates out of
equilibrium at $t_0$, and for any finite $\Delta t$, we have $S_{A_n}(t_0 + \Delta t) \ge S_{A_n}(t_0)$ irrespective of
the sign of $\Delta t$.

I want to draw attention, instead, to the dependence of entropy on the family of
observables, and enunciate the following first conjecture. If the system is sufficiently complex
and ergodic, for most paths $s(t)$ that satisfy the dynamics and for each orientation of
$t$, there is a family of observables $A_n$ such that


$$\frac{dS_{A_n}(t)}{dt} \ge 0. \qquad (14.3)$$


In other words, any motion appears to have initial low entropy (and non-decreasing 
entropy) under some coarse graining. 

The conjecture becomes plausible with a concrete example. Consider a set $\Sigma$ of $N$ distinguishable
balls that move in a box, governed by a time reversible ergodic dynamics. Let
the box have an extension $x \in [-1, 1]$ in the direction of the $x$ coordinate, and be ideally


¹ This equation defines entropy up to an additive constant, because phase space volume has the dimension of
[Action]$^N$, where $N$ is the number of degrees of freedom. This is settled by quantum theory, which introduces
a unit of action, the Planck constant, whose physical meaning is to determine the minimum empirically
distinguishable phase space volume, namely the maximal amount of information in a state. See Haggard and
Rovelli [5].
divided in two halves by $x = 0$. For any given subset $\sigma \subset \Sigma$ of balls, define the observable
$A_\sigma$ to be the mean value of the $x$ coordinate of the balls in $\sigma$. That is, if $x_b$ is the $x$
coordinate of the ball $b$, define

$$A_\sigma = \frac{\sum_{b \in \sigma} x_b}{\sum_{b \in \sigma} 1}. \qquad (14.4)$$

Let $s(t)$ be a generic physical motion of the system, say going from $t = t_a$ to $t = t_b > t_a$.
Let $\sigma_a$ be the set of the balls that are at the right of $x = 0$ at $t = t_a$. The macroscopic observable
$A_a \equiv A_{\sigma_a}$ defines an entropy that, in the large $N$ limit and for most motions $s(t)$, satisfies

$$\frac{dS_{A_a}(t)}{dt} \ge 0. \qquad (14.5)$$


This is the second law of thermodynamics. 

But let us now fix the motion $s(t)$, and define a different observable as follows. Let $\sigma_b$ be
the set of the balls that are at the left of $x = 0$ at $t = t_b$. The macroscopic observable $A_b \equiv A_{\sigma_b}$
defines an entropy that is easily seen to satisfy

$$\frac{dS_{A_b}(t)}{dt} \le 0. \qquad (14.6)$$

This is again the second law of thermodynamics, but now in the reversed time $-t$. It holds
for the generic motion s(t), with a specific observable. 

This is pretty obvious: if at time $t_a$ we ideally color in white all the balls at the right
of $x = 0$ (see Figure 14.1), then the state at $t_a$ is low entropy with respect to this coarse
graining, and the motion mixes the balls and raises the entropy as $t$ moves from $t_a$ to $t_b$.
But if instead we color in gray the balls that are at the left of $x = 0$ at the time $t_b$, then the
reverse is true and entropy increases in the reverse $t$ direction.

Figure 14.1 The same history, seen with different filters: for a filter seeing the white balls that are
on the right at time $t_a$, entropy is low at $t_a$. A filter that sees the gray balls on the left at $t_b$ defines an
entropy low at $t_b$. Since the direction of time flow is determined by increasing entropy, time flows in
a different direction with respect to the two different observables.

The point is simple: for any motion there is a macroscopic family of observables with 
respect to which the state at a chosen end of the motion has low entropy: it suffices to 
choose observables that single out well the state at the chosen end. I call these observables, 
“time oriented”. They are determined by the state itself. 

This simple example shows that, generically, past low entropy is not a feature of a spe- 
cial physical history of microstates of the system. Each such history may appear to be 
time oriented (that is: have increasing entropy) under a suitable choice of macroscopic 
observables. 
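
The two-filter example can be simulated directly. The following sketch makes simplifying assumptions that are mine rather than the text's: the balls move freely and bounce elastically off the walls at $x = \pm 1$ (time reversible, with mixing coming only from the spread in velocities rather than from genuinely ergodic dynamics), and the entropy of a subset of $n$ balls with mean position $A$ is replaced by the large-$n$ Gaussian estimate $-\frac{3}{2} n A^2$ of the log-volume, up to an additive constant. All names are hypothetical.

```python
import random

def position(x0, v, t):
    """Free motion in [-1, 1] with elastic reflection at the walls."""
    z = (x0 + v * t + 1.0) % 4.0
    return z - 1.0 if z <= 2.0 else 3.0 - z

def entropy_proxy(xs):
    """Gaussian large-n estimate of the log-volume at fixed mean A: -(3/2) n A^2."""
    n = len(xs)
    a = sum(xs) / n
    return -1.5 * n * a * a

def two_filters(n_balls=2000, t_a=0.0, t_b=50.0, seed=1):
    """Entropies seen through the filters sigma_a and sigma_b at both ends of one history."""
    rng = random.Random(seed)
    balls = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n_balls)]
    xa = [position(x0, v, t_a) for x0, v in balls]
    xb = [position(x0, v, t_b) for x0, v in balls]
    sigma_a = [i for i in range(n_balls) if xa[i] > 0]  # balls on the right at t_a
    sigma_b = [i for i in range(n_balls) if xb[i] < 0]  # balls on the left at t_b

    def s(sigma, xs):
        return entropy_proxy([xs[i] for i in sigma])

    return {
        "S_a": (s(sigma_a, xa), s(sigma_a, xb)),  # (value at t_a, value at t_b)
        "S_b": (s(sigma_b, xa), s(sigma_b, xb)),
    }
```

For one and the same history, the entropy defined by $\sigma_a$ is lower at $t_a$ than at $t_b$, while the entropy defined by $\sigma_b$ is lower at $t_b$ than at $t_a$: the two filters assign opposite time orientations to the same motion.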

Can this observation be related to the fact that we see entropy increase in the world? An 
objection to this idea is: how can a physical fact of nature, such as the second law, depend 
on a choice of coarse graining, which — so far — seems subjective and arbitrary? In the next 
section I argue that there is nothing arbitrary in the choice of the coarse graining and the 
macroscopic observables. These are fixed by the coupling between subsystems. Different 
choices of coarse graining represent different possible subsystems. 

To pursue the analogy that opens this chapter, different reference systems from which 
the universe can be observed are concretely realized by different rotating bodies, such as 
the Earth. 


14.4 Time-Oriented Subsystems 


The fact that thermodynamics and statistical mechanics require a coarse graining, namely 
a “choice” of macroscopic observables, appears at first sight to introduce a curious element 
of subjectivity into physics, clashing with the objectivity of the predictions of science. 

But of course there is nothing subjective in thermodynamics. A cup of hot tea does not 
cool down because of what I know or do not know about its molecules. The “choice” of 
macroscopic observables is dictated by the ways the system under consideration couples. 
The macroscopic observables of the system are those coupled to the exterior (in thermodynamics,
those that can be manipulated and measured). The thermodynamics and
the statistical mechanics of a system defined by a set of macroscopic observables $A_n$
describe (objectively) the way the system interacts when coupled to another system via 
these observables, and the way these observables behave. 

For instance, the behaviour of a box full of air is going to be described by a certain 
entropy function if the air is interacting with the exterior via a piston that changes its 
volume V. But the same air is going to be described by a different entropy function if it 
interacts with the exterior via two pistons with filters permeable to oxygen and nitrogen, 
respectively (see Figure 14.2). In this case, the macroscopic observables are others and 
chemical potentials enter the game. It is not our abstract “knowledge” of the relative abundance
of oxygen and nitrogen that matters: it is the presence or absence of a physical coupling of
these quantities to the exterior, and the possibility of their independent variation, that determines
which entropy describes the phenomena. Different statistics and thermodynamics of the
same box of air describe different interactions of the box with the exterior.

Figure 14.2 The same system (here a volume of air) is described by different entropy functions,
describing different interactions it can have; here via its total volume or via the volume of its distinct
chemical components.
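
The dependence of the entropy function on the coupling can be made explicit in the simplest idealization of the box of air. As an illustrative assumption not made in the text, treat the air as a mixture of two ideal gases at fixed temperature, with $N_O$ oxygen and $N_N$ nitrogen molecules, Boltzmann constant $k$, and volume-independent constants $c$ and $c'$:

```latex
\begin{align*}
\text{one piston:}\quad
  & S(V) = (N_O + N_N)\, k \ln V + c, \\
\text{two selective pistons:}\quad
  & S(V_O, V_N) = N_O\, k \ln V_O + N_N\, k \ln V_N + c'.
\end{align*}
```

Under this assumption the two functions agree, up to the constant, when $V_O = V_N = V$, but only the second registers an independent variation of the volumes available to the two components.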

In the light of this consideration, let us reconsider the box of the previous section 
replacing the abstract notion of “observable” by a concrete interaction between subsystems. 

Say we have the $N$ balls in a box as above, but now we add a new set of $2^N$ “small”
balls,² with negligible mass, that do not interact among themselves but interact with the
previous (“large”) balls as follows. Each small ball is labeled by a subset $\sigma \subset \Sigma$ and is
attracted by the balls in $\sigma$ and only these, via a force law such that the total attraction is in
the direction of the center of mass $A_\sigma$ of the balls in $\sigma$ (see Figure 14.3).

Generically, a small ball interacts with a large number of large balls, but it does so only
via a single variable: $A_\sigma$. Therefore it interacts with a statistical system, for which $A_\sigma$ is the
single macroscopic observable. For each small ball $\sigma$, the “rest of the universe” behaves as
a thermal system with entropy $S_{A_\sigma}$.

It follows from the considerations of the previous sections that given a generic motion
$s(t)$ there will generically be at least one small ball, the ball $\sigma_a$, for which the entropy of
the rest of the box is never decreasing in $t$, in the thermodynamical limit of large $N$. (There
will also be another small ball, $\sigma_b$, for which the entropy of the rest of the box is never
increasing in $t$.)

Imagine that the box is the universe and each “small” ball $\sigma$ is itself a large system with
many degrees of freedom. Then generically there is at least one of these, namely $\sigma_a$ (in fact
many), for which the rest of the universe has a low-entropy initial state. In other words, it is
plausible to expect the validity of the conjecture stated in the introduction, which I repeat 
here: 


² $2^N$ is the number of subsets of $\Sigma$, namely the cardinality of its power set.

Figure 14.3 The “small” balls are represented on top of the box. The white and gray ones are
attracted, respectively, by the large white and gray balls. Both interact with a statistical system where 
entropy changes, but entropy increases in the opposite direction with respect to each of them. 


Conjecture. In a sufficiently complex system, there is always some subsystem whose interaction
with the rest determines a coarse graining with respect to which the system satisfies
the second law of thermodynamics (in some time direction). 


That is: low past entropy can be fully perspectival. 

Now, since $\sigma_a$ interacts thermodynamically with a universe which was in a low-entropy
state in the past, the world seen by $\sigma_a$ appears organized in time: observed phenomena
display marked and consistent arrows of time, which single out one direction. $\sigma_a$ interacts
with a world where entropy increases, hence “time flows” in one specific direction. The
world seen by $\sigma_a$ may include dissipation, memory, traces of the past, and all the many
phenomena that characterize the universe as we see it. Within the subsystem $\sigma_a$, time-oriented
phenomena that require the growth of entropy, such as evolution or accumulation of
knowledge, can take place. I call such a subsystem “time oriented”.

Could this picture be related to the reason why we see the universe as a system with 
a low-entropy initial state? Could the low entropy of the universe characterize our own 
coupling with the universe, rather than a peculiarity of the microstate of the universe? 

In the next section I answer some possible objections to this idea. Some of these were 
raised and discussed at the 2015 Tenerife Conference on the Philosophy of Cosmology. 


14.5 Objections and Replies 


1. Isn’t this just shifting the problem? Instead of the mystery of a strange (low past entropy) 
microstate of the universe, we have now the new problem of explaining why we belong 
to a peculiar system? 


Yes. But it is easier to explain why the Earth happens to rotate, rather than having to
come up with a rationale for the full cosmos to rotate. The next question addresses the 
shifted problem. 


2. For most subsystems the couplings are such that entropy was not low at one end of 
time. Why then should we belong to a special subsystem that couples in such a peculiar 
manner? 


Because this is the condition for us to be what we are. We live in time. Our own existence 
depends on the fact of being in a situation of strong local entropy production: biological 
evolution, biochemistry, life itself, memory, knowledge acquisition, culture... As empha- 
sized by David Albert, low past entropy is what allows us to reconstruct the past, and have 
memory and therefore gives us our sense of identity. Being part of a time-oriented subsystem 
is the condition for all this. This is a mild use of anthropic reasoning. It is analogous 
to asking why we live on a planet’s surface (a non-generic place in the universe) and 
answering that this is simply what we are: creatures living on the ground, needing water and 
so on. Our inhabiting these quarters of the universe is no more strange than my being born 
in a place where people happen to speak my own language. 


3. Assume that we choose a coarse graining for which entropy is low at initial time $t_0$. 
Would not entropy then move very fast to a maximum, on the time scale of the molecular 
interactions, and then just fluctuate around the maximum? (Point raised by Bob Wald). 


There are different time scales. The thermalization time scale can be hugely different 
from the time scale of the molecular interactions, and in fact it is clearly so in our universe. 
Given a history of an isolated system, a situation where entropy increases can exist only for 
a time scale shorter than the thermalization time. This is precisely the situation in which 
we are in the universe: the Hubble time is much longer than the time scale of microphysics, 
but much shorter than the thermalization time of the visible universe. So, the time scales 
are fine. 


4. The interactions in the real universe are not as arbitrary as in the example of the heavy 
and small balls. In fact, in the universe there are no more than a small number of 
fundamental interactions. (Point raised by David Albert). 


The fundamental interactions are only a few, but the interaction channels they open are 
innumerable. The example of the colors makes this clear: the relevant elementary inter- 
action is just the electromagnetic interaction. But our eyes pick up an incredibly tiny 
component of the electromagnetic waves. They pick up three variables out of an infinite 
number: they recognize a 3D space of colors (with some resolution) out of the virtually 
infinite dimensional space of waveforms. So we are precisely in the situation of the small 
balls, which only interact with a tiny fraction of the variables of the external world. It 
can be argued that these are the most relevant for us. This is precisely the point: it is by 
interacting with some specific variables that we may pick up time-oriented features of the 
world. Another simple example is a normal radio: it can easily tune in to a single band, 
out of the entire electromagnetic spectrum. We are in a similar situation. For instance, we 
observe a relatively tiny range of phenomena, among all those potentially existing in the 
full range of time scales existing in the 60 orders of magnitude between the Planck time 
and cosmological time. 


5. We see entropy increase in cosmology, which is the description of the whole, without 
coarse graining. 


Current scientific cosmology is not the dynamics of everything: it is the description of an 
extremely coarse grained picture of the universe. Cosmology is a feast of coarse graining. 


6. The observables that we use to describe the world are coarse grained but they are the 
natural ones. 


Too often “natural” is just what we are used to. Considering something “natural” is to 
be blind to subjectivity. For somebody it is natural to write from left to right. For others, 
the opposite. 


7. Our interactions pick up variables that are determined by the spatio-temporal struc- 
ture of the world: spacetime integrals of conserved quantities. Quasi-classical domains 
are determined by these. Are these sufficiently generic for the mechanism you suggest? 
(Point raised by Jim Hartle). 


Yes they are, because spacetime averages of conserved quantities carry a very large 
amount of information, as our eyes testify. But this point is better raised in the context 
of quantum gravity, where the spacetime-regions structure is itself an emergent classical 
phenomenon that requires a quasi-classical domain. The emergence of a spatio-temporal 
structure from a quantum gravitational context may be related to the emergence of the 
second law in the sense I am describing here. In both cases there is a perspectival aspect of 
the emergence. 


8. Can the abstract picture of the coarse graining determining entropy be made concrete 
with an example? 


The biosphere is an oriented subsystem of the universe. Consider the thermodynamical 
framework of life on Earth. There is a constant flow of electromagnetic energy on Earth: 
incoming radiation from the Sun and outgoing radiation towards the sky. Microscopically, 
this is a certain complicated solution of the Maxwell equations. But as far as life is concerned, 
most details of this solution (such as the precise phase of a single solar photon falling 
on the Pacific ocean) are irrelevant. 
Figure 14.4 Electromagnetic energy enters and exits the Earth. The biosphere interacts with coarse 
grained aspects of it (frequency) with respect to which there is entropy production, and therefore 
time orientation, on the planet. 


What matters to life on Earth are energy and a certain range of frequency, integrated over 
small regions. This determines a coarse graining and therefore a notion of entropy. Now 
the incoming energy is the same as the outgoing energy, but not so for the frequency. The 
Earth receives energy $E$ (from the Sun) at higher frequency $\nu_a$ and emits energy (towards 
the sky) at lower frequency $\nu_b$ (see Figure 14.4). This is a fact about the actual solution 
of the Maxwell equations in which we happen to live. If we take energy and frequency as 
macroscopic observables, then an entropy is defined by such coarse graining. Roughly, 
this entropy counts the number of photons; at frequency $\nu$ the number of photons in a wave 
of energy $E$ is $N = E/\hbar\nu$. If the received energy is emitted at lower frequency, the emitted 
entropy $S_b$ is higher than the received entropy $S_a$. The process produces entropy: $S_b > S_a$. 
This entropy production is not a feature of the solution of the Maxwell equations alone: it 
is a feature of this solution together with a set of macroscopic observables (integrated energy and 
frequency: oriented observables for this solution of the Maxwell equations) to which living 
systems couple. 
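
As a rough numerical sketch of this photon-counting estimate (not part of the original argument): if a given energy $E$ is re-emitted at a lower frequency, the photon number $N = E/\hbar\nu$, and with it the coarse-grained entropy, grows by the ratio of the two frequencies. The temperatures used below (about 5800 K for sunlight, about 255 K for the Earth's thermal emission) and the choice of $k_B T/\hbar$ as a characteristic angular frequency are illustrative assumptions, not values taken from the text.

```python
# Illustrative sketch: photon number N = E / (hbar * nu) for equal energy
# received at a high frequency and re-emitted at a low one.
HBAR = 1.054571817e-34   # reduced Planck constant, J*s
K_B = 1.380649e-23       # Boltzmann constant, J/K


def characteristic_frequency(temperature_kelvin):
    """Assumed characteristic angular frequency k_B*T/hbar of thermal radiation."""
    return K_B * temperature_kelvin / HBAR


def photon_number(energy_joule, frequency):
    """N = E / (hbar * nu), the counting used in the text."""
    return energy_joule / (HBAR * frequency)


T_SUN, T_EARTH = 5800.0, 255.0   # assumed temperatures, in kelvin
E = 1.0                          # same energy in and out, in joules

nu_a = characteristic_frequency(T_SUN)     # incoming (higher) frequency
nu_b = characteristic_frequency(T_EARTH)   # outgoing (lower) frequency
n_in = photon_number(E, nu_a)
n_out = photon_number(E, nu_b)
ratio = n_out / n_in   # equals nu_a / nu_b = T_SUN / T_EARTH

print(f"photons out per photon in: {ratio:.1f}")
```

Under these assumptions the Earth emits roughly twenty photons for every solar photon it absorbs, which is the sense in which the outgoing entropy exceeds the incoming one.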

Any system on Earth whose dynamics is governed by interactions with E and v has 
a source of negative entropy at its disposal. This is what is exploited by the biosphere 
on Earth to build structure and organization. The point I am emphasizing is that what is 
relevant and peculiar here is not the individual solution of the Maxwell equations describing 
the incoming and outgoing waves: it is the peculiar manner in which the interaction with 
this energy is coarse grained by the biosphere. 
14.6 Quantum Theory and General Relativity 


Quantum phenomena provide a source of entropy distinct from the classical one gener- 
ated by coarse graining: entanglement entropy. The state space of any quantum system is 
described by a Hilbert space $\mathcal{H}$, with a linear structure that plays a major role for physics. 
If the system can be split into two components, its state space splits into the tensor product 
of two Hilbert spaces: $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$, each carrying a subset of observables. Because of 
the linearity, a generic state is not a tensor product of component states; that is, in general 
$\psi \neq \psi_1 \otimes \psi_2$. This is entanglement. Restricting the observables to those of a subsystem, 
say system 1, determines a quantum entropy over and above classical statistical entropy. 
This is measured by the von Neumann entropy $S = -\mathrm{tr}[\rho \log \rho]$ of the density matrix 
$\rho = \mathrm{tr}_{\mathcal{H}_2} |\psi\rangle\langle\psi|$. Coarse graining is given by the restriction to the observables of a single 
subsystem. 
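
To make the definition concrete, here is a minimal sketch for the smallest possible case, two two-level systems. It is an illustration only: the function name and the two sample states are assumptions introduced here, and the closed-form eigenvalues are valid only for a 2x2 reduced density matrix.

```python
import math


def entanglement_entropy(c):
    """Von Neumann entropy S = -tr[rho log rho] of subsystem 1 for a two-qubit
    pure state psi = sum_{k,m} c[k][m] |k>|m>  (c is a 2x2 nested list)."""
    norm2 = sum(abs(c[k][m]) ** 2 for k in range(2) for m in range(2))
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    # rho = C C^dagger / norm2 has trace 1 and determinant |det C|^2 / norm2^2,
    # so its two eigenvalues are (1 +/- sqrt(1 - 4 det rho)) / 2.
    det_rho = abs(det) ** 2 / norm2 ** 2
    root = math.sqrt(max(0.0, 1.0 - 4.0 * det_rho))
    eigenvalues = [(1.0 + root) / 2.0, (1.0 - root) / 2.0]
    return sum(-p * math.log(p) for p in eigenvalues if p > 0.0)


# A tensor-product state |0>|0> and a maximally entangled state (|00> + |11>)/sqrt(2).
product_state = [[1.0, 0.0], [0.0, 0.0]]
bell_state = [[1 / math.sqrt(2), 0.0], [0.0, 1 / math.sqrt(2)]]

s_product = entanglement_entropy(product_state)   # vanishes for a product state
s_bell = entanglement_entropy(bell_state)         # log 2, the maximum for a qubit
```

The product state carries no entropy when restricted to subsystem 1, while the entangled state carries the maximum, even though both are pure states of the whole.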

The conjecture presented in this chapter can then be extended to the quantum context. 
Consider a “sufficiently complex” quantum system.³ Then: 


Conjecture Given a generic state evolving in time as $\psi(t)$, there exist splits of the system 
into subsystems such that the von Neumann entropy is low at initial time and increases in 
time.⁴ 


The point here is to avoid assuming a fixed tensorial structure of $\mathcal{H}$ a priori. Instead, given 
a generic state, we can find a tensorial split of $\mathcal{H}$ which sees von Neumann entropy grow 
in time. 

This conjecture, in fact, is not hard to prove. A separable Hilbert space admits many 
discrete bases $|n\rangle$. Given any $\psi \in \mathcal{H}$, we can always choose a basis $|n\rangle$ where $\psi = |1\rangle$. 
Then we can consider two Hilbert spaces, $\mathcal{H}_1$ and $\mathcal{H}_2$, with bases $|k\rangle$ and $|m\rangle$, and map 
their tensor product to $\mathcal{H}$ by identifying $|k\rangle \otimes |m\rangle$ with the state $|n\rangle$ where $(k,m)$ appears, 
say, in the $n$-th position of the Cantor ordering of the $(k,m)$ couples ((1,1), (1,2), (2,1), 
(1,3), (2,2), (3,1), (1,4), ...). Then, $\psi = |1\rangle \otimes |1\rangle$ is a tensor state and has vanishing von Neumann 
entropy. On the other hand, recent results show that entanglement entropy generically 
evolves towards maximizing entropy of a fixed tensor split (see Deutsch et al. [6]). 
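
The relabelling used in this argument can be sketched explicitly. The code below is an illustration under stated assumptions: the function names are hypothetical, and the ordering is taken to run along the diagonals $k + m = \text{const}$ with $k$ increasing, which reproduces the sequence (1,1), (1,2), (2,1), (1,3), (2,2), (3,1), (1,4), ...

```python
def cantor_pair(n):
    """Return the pair (k, m) in the n-th position (n = 1, 2, ...) of the
    ordering (1,1), (1,2), (2,1), (1,3), (2,2), (3,1), (1,4), ..."""
    if n < 1:
        raise ValueError("positions start at 1")
    diagonal = 1              # the diagonal k + m = diagonal + 1 holds `diagonal` pairs
    while n > diagonal:
        n -= diagonal
        diagonal += 1
    k = n
    m = diagonal + 1 - k
    return k, m


def cantor_index(k, m):
    """Inverse map: position n of the pair (k, m) in the same ordering."""
    diagonal = k + m - 1
    return diagonal * (diagonal - 1) // 2 + k


# Basis vector |n> is identified with |k> (x) |m>, where (k, m) = cantor_pair(n).
first_pairs = [cantor_pair(n) for n in range(1, 8)]
```

Since position 1 corresponds to the pair (1,1), the basis vector $|1\rangle$ is identified with $|1\rangle \otimes |1\rangle$, which is why a state chosen as the first basis element is a product state in the induced split.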

Therefore for any time evolution $\psi(t)$ there is a split of the system into subsystems such 
that the initial state has zero entropy and then entropy grows. Growing and decreasing of 
(entanglement) entropy is an issue about how we split the universe into subsystems, not a 
feature of the overall state of things (on this, see Tegmark [7]). Notice that in quantum field 
theory there is no single natural tensor decomposition of the Fock space. 

Finally, let me get to general relativity. In all the examples above, I have considered 
non-relativistic systems where a notion of the single time variable is clearly defined. I 
have therefore discussed the direction of time, but not the choice of the time variable. 
In special relativity, there is a different time variable for each Lorentz frame. In general 


³ This means: with a sufficiently complex algebra of observables and a Hamiltonian which is suitably “ergodic” 
with respect to it. A quantum system is not determined uniquely by its Hilbert space, Hamiltonian and state. 
All separable Hilbert spaces are isomorphic, and the spectrum of the Hamiltonian, which is the only remaining 
invariant quantity, is not sufficient to characterize the system. 

⁴ And others for which entropy is low at final time. 
relativity, the notion of time further breaks into related but distinct notions, such as proper 
time along worldlines, coordinate time, clock time, asymptotic time, cosmological time... 
Entropy increase becomes a far more subtle notion, especially if we take into account the 
possibility that thermal energy leaks to the degrees of freedom of the gravitational field 
and therefore macrostates can include microstates with different spacetime geometries. In 
this context, a formulation of the second law of thermodynamics requires us to identify 
not only a direction for the time variable, but also the choice of the time variable itself 
in terms of which the law can hold [8]. In this context, a split of the whole system into 
subsystems is even more essential than in the non-relativistic case, in order to understand 
thermodynamics [8]. The observations made in this chapter therefore apply naturally to the 
non-relativistic case. 

The perspectival origin of many aspects of our physical world has been recently emphasized 
by some of the philosophers most sensitive to modern physics [9, 10]. I believe that 
the arrow of time is not going to escape the same fate. 

The reason for the entropic peculiarity of the past should not be sought in the cosmos at 
large. The place to look for it is in the split, and therefore in the macroscopic observables 
that are relevant to us. Time asymmetry, and therefore “time flow”, might be features 
of a subsystem to which we belong, features needed for information gathering creatures 
like us to exist, not features of the universe at large. 
15 
Relational Quantum Cosmology 


FRANCESCA VIDOTTO 


15.1 Conceptual Problems in Quantum Cosmology 


The application of quantum theory to cosmology raises a number of conceptual questions, 
such as the role of the quantum-mechanical notion of “observer” or the absence of a time 
variable in the Wheeler-DeWitt equation. I point out that a relational formulation of quantum 
mechanics, and more generally the observation that evolution is always relational, 
provides a coherent solution to this tangle of problems. 

A number of confusing issues appear when we try to apply quantum mechanics to cos- 
mology. Quantum mechanics, for instance, is generally formulated in terms of an observer 
making measurements on a system. In a laboratory experiment, it is easy to identify the 
system and the observer: but what is the observer in quantum cosmology? Is it part of the 
same universe described by the cosmological theory, or should we think of it as external 
to the universe? Furthermore, the basic quantum dynamical equation describing a system 
including gravity is not the Schrödinger equation, but rather the Wheeler-DeWitt 
equation, which has no time parameter: is this related to the absence of an observer 
external to the universe? How do we describe the quantum dynamics of the universe with- 
out a time variable in the dynamical equation and without an observer external to the 
universe? 

I suggest that clarity on these issues can be obtained by simply recognizing the relational 
nature of quantum mechanics and more generally the relational nature of physical 
evolution. In the context of quantum theory, this nature is emphasized by the so-called 
relational interpretation of quantum mechanics (Rovelli, 1996). The relational nature of 
evolution, on the other hand, has been pointed out by the partial-observable formulation of 
general covariant dynamics (Rovelli, 2002). 

A central observation is that there is a common confusion between two different 
meanings of the expression “cosmology”. This is discussed below, in Section 15.1.1. 
A considerable amount of the difficulties mentioned above stems, I argue, from this 
confusion. The discussion clarifies the role of the quantum mechanical observer in 
cosmology, which is considered in Section 15.1.2. In Section 15.2, I briefly describe 
the relational interpretation of quantum mechanics and some related aspects of quantum 
theory. In Section 15.3.2, I discuss the timeless aspect of the Wheeler-DeWitt 
equation. In Section 15.3.5, I show how this perspective on quantum cosmology can be 
taken as an effective conceptual structure in the application of loop quantum gravity to 
cosmology. 


15.1.1 Cosmology is not About Everything 


The expression “cosmology” is utilized to denote two very different notions. The first is 
the subject of exploration of the cosmologists. The second is the “totality of things”. These 
are two very different meanings. 

To clarify why, consider a common physical pendulum. Its dynamics is described by 
the equation $\ddot{q} = -\omega^2 q$, and we know well how to deal with the corresponding classical 
and quantum theory. Question: does this equation describe “everything” about the physical 
pendulum? The answer is obviously negative, because the pendulum has a complicated 
material structure, ultimately made of fast-moving electrons, quarks, and whatever, not to 
mention the innumerable bacteria most presumably living on its surface and their rich bio- 
chemistry... The point is that the harmonic oscillator equation certainly does not describe 
the totality of the physical events on the real pendulum: it describes the behavior of one 
dynamical variable, neglecting everything else happening at smaller scales. 

In a very similar manner, cosmology — in the sense of “what cosmologists actually 
study” — describes a number of large scale degrees of freedom in the universe. This num- 
ber may be relatively large: it may include all the measured CMB modes, or the observed 
large scale structures. But it remains immensely smaller than the total number of degrees 
of freedom of the real universe: the details of your reading these words now do not appear 
in any of the equations written by cosmologists. In a strict sense, “cosmology”, defined as 
the object of study of the cosmologists, denotes the large scale degrees of freedom of the 
universe. The fact that many shorter scale degrees of freedom are neglected is no different 
from what happens in any other science: a biologist studying a cat is not concerned with 
the forces binding the quarks in the nucleus of an atom in the cat’s nose. 

There is however a different utilization of the term “cosmology” that one may find in 
some physics articles: sometimes it is used to denote a hypothetical science dealing with 
“the totality of all degrees of freedom of Nature”. For the reason explained above, the two 
meanings of “cosmology” are to be kept clearly distinct, and much of the conceptual con- 
fusion raised by quantum cosmology stems from confusing these two different meanings 
of the term. For clarity, I use here two distinct terms: I call “cosmology” the science of the 
large scale degrees of freedom of the Universe, namely the subject matter of the cosmologists. 
We can call “totology” (from the Latin “totus”, meaning “all”) the science, if it exists, 
of all degrees of freedom existing in reality. 

The important point is that cosmology and totology are two different sciences. If we are 
interested in the quantum dynamics of the scale factor, or the emergence of the cosmo- 
logical structures from quantum fluctuations of the vacuum, or in the quantum nature of 
the Big Bang, or the absence of a time variable in the Wheeler-DeWitt equation, we are 
dealing with specific issues in cosmology, not in totology. 
15.1.2 The Observer in Cosmology 


The considerations above indicate that a notion of “observer” is viable in cosmology. In 
cosmology, the “observer” is formed by ourselves, our telescopes, the measurement appa- 
ratus on our spacecraft, and so on. The “system” is formed by the large scale degrees of 
freedom of the universe. The two are clearly dynamically distinct. The observer is not “part 
of the system”, in the dynamical sense. 

Of course the observer is “inside” the system in a spatial sense, because the scale factor 
describes the dimension of a universe within which the observer is situated. But this is 
no more disturbing than the fact that a scientist studying the large scale structure of the 
magnetic field of the Earth is situated within this same magnetic field. Being in the same 
region of space does not imply being in the same degrees of freedom. 

The notion of a dynamically external observer may be problematic in totology, but it is 
not so in conventional cosmology, and therefore there is no reason for it to be problematic 
in quantum cosmology. 

Actually, there is an aspect of the conventional presentation of quantum theory which 
becomes problematic: the idea that a system has an intrinsic physical “state” which jumps 
abruptly during a measurement. This aspect of quantum theory becomes implausible in 
cosmology, because the idea that when we look at the stars the entire universe could jump 
from one state to another is not very palatable. But whether or not we interpret this jump as 
a physical event happening to the state depends on the way we interpret the quantum the- 
ory. This is why the application of quantum theory to cosmology bears on the issue of the 
interpretation of quantum mechanics. There are some interpretations of quantum mechan- 
ics that become implausible when utilized for quantum cosmology. But not all of them. 
Below, I describe an interpretation which is particularly suitable for cosmology and which 
does not demand implausible assumptions such as the idea that our measurement could 
change the entire intrinsic state of the universe, as demanded by the textbook Copenhagen 
interpretation. As we shall see, in the relational interpretation of quantum mechanics the 
“quantum state” is not interpreted as an intrinsic property of a system, but only as the infor- 
mation one system has about another. There is nothing implausible if this changes abruptly 
in a measurement, even when the observed system is formed by the large scale degrees of 
freedom of the universe. 


15.2 Relational Quantum Mechanics 


The relational interpretation of quantum mechanics was introduced in Rovelli (1996) and 
has attracted the interest of philosophers such as Michel Bitbol, Bas van Fraassen and Mauro 
Dorato (see Dorato (2013) and references therein). 

The point of departure of the relational interpretation is that the theory is about quantum 
events rather than about the wave function or the quantum state. The distinction can be 
traced to the very beginning of the history of quantum theory: Heisenberg’s key idea was 
to replace the notion of an electron continuously existing in space with a lighter ontology, 
the one given just by discrete tables of numbers. The electron, in Heisenberg’s vision, can 
be thought of as “jumping” from one interaction to another.¹ In contrast, one year later, 
Schrödinger was able to reproduce the technical results of Heisenberg and his collaborators 
using a wave in space. Schrödinger’s wave evolved into the modern notion of quantum 
state. What is quantum mechanics about: an evolving quantum state, as in Schrödinger; or 
a discrete sequence of quantum events that materialise when systems interact? 

Relational quantum mechanics is an interpretation of quantum mechanics based on the 
second option, namely on Heisenberg’s original intuition. The advantage is that the quantum 
state is now interpreted as a mere theoretical bookkeeping device for the information about 
a system S that a given system O might have gathered in the course of its past interactions 
with S. Therefore, the quantum state of S is not intrinsic to S: it is the state of S relative 
to O. It describes, in a sense, the information that O may have about S. No surprise if it 
jumps abruptly at a new interaction, because at each new interaction O can gather new 
information about S. 

Thus, if we adopt this reading of quantum theory, there is no meaning in “the wave 
function of the entire universe”, or “the quantum state of everything”, because these notions 
are extraneous to the relational interpretation. A quantum state, or a wave function,² can 
only refer to the interactions between two interacting subsystems of the universe. It has no 
more reality than the distributions of classical statistical mechanics: tools for computing. 
What is real is not the quantum state: what is real are the individual quantum events. For 
all these reasons, the relational interpretation of quantum mechanics is particularly suitable 
for quantum cosmology. 

Of course there is a price to pay for this remarkable simplification, as is always the case 
with quantum theory. The price to pay here is that we are forced to recognise that the 
fact itself that a quantum event has happened, or not, must be interpreted as relative to 
a given system. Quantum events cannot be considered absolute; their existence is rela- 
tive to the physical systems involved in an interaction. Two interacting systems realise 
a quantum event relative to one another, but not necessarily relative to a third system. 
This caveat, discussed in detail in Rovelli’s original article and in the numerous philosophers’ 
articles on the relational interpretation, is necessary to account for interference and 
to accommodate the equality of all physical systems. In fact, this is the metaphysical core 
of the relational interpretation, where a naive realism is traded for mature realism, able to 
account for Heisenberg’s lighter ontology. 


¹ Heisenberg gives a telling story about how he got the idea. He was walking in a park in Copenhagen at night. 
All was dark except for a few islands of light under street lamps. He saw a man walking under one of those, 
then disappearing in the dark, then appearing again under the next lamp. Of course, he thought, man is big and 
heavy and does not “really” disappear: we can reconstruct his path through the dark. But what about a small 
particle? Maybe what quantum theory is telling us is precisely that we cannot use the same intuitions for small 
particles. There is no classical path between their appearance here and their appearance there. Particles are 
objects that manifest themselves only when there is an interaction, and we are not allowed to fill up the gap in 
between. The ontology that Heisenberg proposes does not increase on the ontology of classical mechanics: it 
reduces it. It is less, not more. Heisenberg removes excess baggage from classical ontology and is left with a 
minimum necessary to describe the world. 

² The “wave function” is the representation of the quantum state by means of a function on the spectrum of a 
complete commuting family of observables; for instance, a wave on configuration space. 
In cosmology, however, it is a small price to pay: the theory itself guarantees that as long 
as the quantum effect in the interactions between quantum systems can be disregarded, 
different observing systems give the same description of an observed system and we are 
not concerned about this lighter ontology. This is definitely the case in cosmology.³ 

Under the relational reading of quantum theory, the best description of reality we can 
give is the way things affect one another. Things are manifest in interactions (quantum 
relationalism). Quantum theory describes reality in terms of facts appearing in interac- 
tions, and relative to the systems that are interacting. Cosmology describes the dynamics 
— possibly including the quantum dynamics — of the large scale degrees of freedom of the 
universe and the way these are observed and measured by our instruments. 

Before closing this section on the relational interpretation, I discuss below two important 
notions on which this interpretation is based: discreteness and quantum information. I also 
discuss the role of the wave function in quantum mechanics, and its 
common overestimation. 


15.2.1 Discreteness 


Quantum mechanics is largely a discovery of a very peculiar form of discreteness in 
Nature. Its very name refers to the existence of peculiar discrete units: the “quanta”. 
Many current interpretations of quantum mechanics underemphasise this discreteness at 
the core of quantum physics, and many current discussions on quantum theory neglect 
it entirely. Historically, discreteness played a pivotal role in the discovery of the theory: 


• Planck (1900): finite size packets of energy $E = h\nu$ 

• Einstein (1905): discrete particles of light 

• Bohr (1912): discrete energy levels in atomic orbits 

• Heisenberg (1925): tables of numbers 

• Schrödinger (1926): discrete stationary waves 

• Dirac (1930): state space and operators with possibly discrete spectra 

• Spin: discrete values of angular momentum 

• QFT: particles as discrete quanta of a field. 


³ In spite of this lighter ontology, this interpretation of quantum theory takes fully the side of realism, in the 
sense that it assumes that the universe exists independently from any conscious observer actually observing 
it. Consciousness, mind, humans, or animals, have no special role. Nor has any role any particular structure 
of the world (macroscopic systems, records, information gathering devices...). Rather, the interpretation is 
democratic: the world is made of physical systems, which are on equal ground and interact with one another. 
Facts are realized in interactions and the theory describes the probabilities of the outcome of future interaction, 
given past ones. But this is a realism in the weak sense of Heisenberg. In the moment of the interaction there 
is a real fact. But this fact exists relatively to the interacting systems, not in the absolute. In the same manner, 
two objects have a well defined velocity with respect to one another, but we cannot say that a single object 
has an absolute velocity by itself, unless we implicitly refer to some other reference object. Importantly, the 
structure of quantum theory indicates that in this description of reality the manner in which we split the world 
in subsystems is largely arbitrary. This freedom is guaranteed by the tensorial structure of QM, which grants the 
arbitrariness of the positioning of the boundary between systems which has been studied by Wigner: the magic 
of the quantum mechanics formalism is that we can split up the Hilbert space into pieces, and the formalism 
keeps its consistency. 
Figure 15.1 In the shaded region R there is an infinite number of classical states, but only a finite 
number of quantum ones. 


As these examples indicate, the scale of discreteness is always 
set by $\hbar$. But what is actually discrete, in general, in quantum mechanics? It is easy to 
answer by noticing that $\hbar$ has the dimensions of an action and that the (Liouville) volume 
of the phase space of any system has always the dimension of an action (per degree of 
freedom). The constant $\hbar$ sets a unit of volume in the phase space of any system. The 
physical discovery at the basis of quantum theory is the impossibility of pinpointing the 
classical state of a system with a precision superior to such a unit volume. 

If we have made some measurements on a system, with a given accuracy, and, con- 
sequently, we know that the system is in a certain region R of its phase space, then the 
quantum theory tells us that there is only a finite number of possible (orthogonal, namely 
distinguishable) states in which the system can be, given by 


$$N = \frac{\mathrm{Volume}(R)}{2\pi\hbar} \tag{15.1}$$ 


Let us consider a simple example: say that measuring the energy $E$ of a harmonic oscillator 
we learn that $E \in [E_1, E_2]$. This determines a region $R$ of phase space, with volume 
$V = \int_{E \in [E_1, E_2]} dp\,dq = 2\pi \frac{E_2 - E_1}{\omega}$ (see Figure 15.1). There is an infinite number of possible 
classical states, and an infinite number of possible values that the energy can actually have. 
In the quantum theory, that is, in Nature, the energy cannot take all these values, but 
only a finite number, which is $N = \frac{E_2 - E_1}{\hbar\omega}$. This is the maximum number of orthogonal 
states compatible with the previous measurement. 
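
The counting in this example can be checked numerically. The sketch below compares the phase-space estimate $N = V/(2\pi\hbar)$, with $V = 2\pi (E_2 - E_1)/\omega$, against a direct count of oscillator levels. The level formula $E_n = \hbar\omega(n + 1/2)$ is the standard textbook result rather than something stated in this chapter, and the frequency and energy window are arbitrary values chosen for illustration.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s


def states_from_phase_space(e1, e2, omega):
    """N = V / (2*pi*hbar), with V = 2*pi*(E2 - E1)/omega for the oscillator."""
    volume = 2.0 * math.pi * (e2 - e1) / omega
    return volume / (2.0 * math.pi * HBAR)


def levels_in_window(e1, e2, omega):
    """Count the exact levels E_n = hbar*omega*(n + 1/2) lying in [E1, E2]."""
    unit = HBAR * omega
    n_min = max(0, math.ceil(e1 / unit - 0.5))
    n_max = math.floor(e2 / unit - 0.5)
    return max(0, n_max - n_min + 1)


omega = 1.0e15                       # assumed angular frequency, rad/s
e1 = 10.25 * HBAR * omega            # assumed lower edge of the energy window
e2 = 110.25 * HBAR * omega           # assumed upper edge of the energy window

estimate = states_from_phase_space(e1, e2, omega)
exact = levels_in_window(e1, e2, omega)
```

For a window one hundred level spacings wide, the phase-space volume in units of $2\pi\hbar$ and the direct count of levels agree; for narrow windows the two can differ by about one state, depending on where the edges fall.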

The example shows that a region of phase space is the specification of a certain amount 
of information we have about a system. This is general: quantum mechanics teaches us 
that in every finite region of phase space we can only accommodate a finite number of 
orthogonal states, namely that there is a finite amount of information that we can extract from any 
finite region of phase space. 
The same is true in quantum field theory. The quanta of a field are discrete particles 
(Dirac) and this is precisely a manifestation of this same discreteness; in particular, this is 
manifested in the discreteness of the spectrum of the energy of each mode. 

Discreteness, in this sense, is the defining property of all quantum systems. The 
discreteness scale is set by $\hbar$, an action, or phase space volume. 


15.2.2 Information 


The notion of information useful in quantum theory is Shannon’s relative information, 
which is defined as follows. Given two systems, such that we can find the first system in 
$N_a$ states and the second in $N_b$, we say that the two have information about one another if, 
because of a physical constraint, we can find the combined state in a number $N_{ab}$ of states 
which is smaller than $N_a \times N_b$. The relative information is then defined by 


$$I = \log_2 (N_a \times N_b) - \log_2 (N_{ab}) \tag{15.2}$$ 


The utility of this definition of information is that there is nothing mental or subjective 
about it: it is simply a measure that physics establishes between two degrees of freedom. 
For instance, as long as my pencil is not broken, each extreme of the pencil has information 
about the other, because the two can be in a smaller number of places than two separated 
objects. Knowledge of the position of one gives some information about the position of the
other. 
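
The definition can be turned into a few lines of code. This is a minimal sketch; the function name and the toy "pencil on a ring" numbers are assumptions introduced for illustration, not taken from the text.

```python
import math


def relative_information(n_a, n_b, n_ab):
    """Relative information in bits: log2(N_A * N_B) - log2(N_AB).

    n_a, n_b : number of states available to each system on its own
    n_ab     : number of states available to the combined system,
               which a physical constraint may reduce below n_a * n_b
    """
    if not 0 < n_ab <= n_a * n_b:
        raise ValueError("need 0 < n_ab <= n_a * n_b")
    return math.log2(n_a * n_b) - math.log2(n_ab)


if __name__ == "__main__":
    # Two unconstrained two-state systems: no relative information.
    print(relative_information(2, 2, 4))
    # Two two-state systems constrained to agree: one bit.
    print(relative_information(2, 2, 2))
    # Toy pencil: each end can sit on any of 8 sites of a ring, but the
    # other end must be on one of the 2 neighbouring sites (8 * 2 = 16).
    print(relative_information(8, 8, 16))
```

The value depends only on the counting of states allowed by the constraint, which is why the definition contains nothing mental or subjective.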

The existence of this correlation is a measurable property of the combined system. If 
we have information about a system, we can make predictions about the outcome of future 
interactions with it. We can call “relevant” information that portion of the information that 
we have about a system which is not redundant in view of such predictions. In the relational 
interpretation, physics is the theory of the relevant relative information that systems can 
have about one another. In Rovelli (1996), two basic postulates were proposed, meant to 
capture the physical content of quantum theory: 


Postulate I There is a maximum amount of relevant information that can be extracted 
from a system. 


Postulate II It is always possible to acquire new relevant information about a system. 


Remarkably, the entirety of the quantum mechanical formalism (Hilbert spaces, self-adjoint
operators, eigenvalues and eigenstates, projection postulate...) can be recovered on the
basis of these two postulates, plus a few other more technical assumptions. The effort to
understand the physical meaning of these additional postulates, and to provide a mathematically
rigorous reconstruction theorem, has been pursued by a number of authors and
is still in progress (Grinbaum, 2003; Hoehn, 2014).

A given system S can be characterized by a set of variables. These take values in a
set (the spectrum of the operators representing them, possibly discrete). In the course of 
an interaction with a second system O, the effect of the interaction on O depends on the 
value of one of these variables. We can express this by saying that the system O has then 
the information that this variable of the system S has the given value. This is one of the 
elementary quantum events that quantum theory is concerned with. The first postulate captures
an essential aspect of quantum theory: the impossibility of associating a single point
in phase space to the state of a system. To do so, we would need an infinite amount of information.
The maximum localization in phase space is limited by $\hbar$, which determines the
minimal physical cell in phase space. The information about the state of a system is therefore
limited. The second postulate distinguishes quantum systems from classically discrete
systems: even if finite, information is always incomplete, in the sense that new information
can be gathered, at the price of making previous information irrelevant for predicting
future interactions. For instance, by measuring the $L_x$ component of angular momentum we destroy
information we previously had on the $L_z$ component.
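
The trade-off expressed by the second postulate can be illustrated with the smallest possible case. The sketch below assumes a spin-1/2 system (the text speaks of angular momentum in general) and labels the two components involved x and z; the names `prob`, `Z_UP` and `X_UP` are hypothetical helpers for this illustration.

```python
import math


def prob(outcome, state):
    """Born probability |<outcome|state>|^2 for two-component states."""
    amplitude = sum(o.conjugate() * s for o, s in zip(outcome, state))
    return abs(amplitude) ** 2


R = 1.0 / math.sqrt(2.0)
Z_UP = (1 + 0j, 0j)        # eigenstate of the z component, eigenvalue +1/2
X_UP = (R + 0j, R + 0j)    # eigenstate of the x component, eigenvalue +1/2

# Before: the z component is known with certainty.
state = Z_UP
p_z_before = prob(Z_UP, state)

# A measurement of the x component yields, say, "up"; the state is updated.
state = X_UP
p_z_after = prob(Z_UP, state)

if __name__ == "__main__":
    print(p_z_before, p_z_after)
```

After the new information about the x component is acquired, the earlier information about z no longer helps to predict a z measurement: both outcomes have become equally likely.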


15.2.3 The Role of the Wave Function 


If the real entities in quantum theory are discrete quantum events, what is the quantum 
state, or the wave function, which we commonly associate with an evolving system in
many applications of quantum theory?

A good way to make sense of a quantity in a theory is to relate it to the corresponding 
quantity in an approximate theory, which we understand better for historical reasons. We 
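can therefore investigate the meaning of the quantum mechanical wave function by studying
what it becomes in the semiclassical limit. As is well known, if $\psi(x,t)$ is the wave
function of a Newtonian particle,


$$\psi(x,t) \sim e^{\frac{i}{\hbar} S(x,t)}, \tag{15.3}$$


where $S(x,t)$ is the Hamilton function, the Schrödinger equation becomes in the classical
limit


$$i\hbar\,\frac{\partial}{\partial t}\psi(x,t) = -\frac{\hbar^2}{2m}\frac{\partial^2}{\partial x^2}\psi(x,t) + V(x)\,\psi(x,t) \tag{15.4}$$

$$\longrightarrow\quad -\frac{\partial S(x,t)}{\partial t} = \frac{1}{2m}\left(\frac{\partial S(x,t)}{\partial x}\right)^2 + V(x), \tag{15.5}$$


which is the Hamilton-Jacobi equation.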
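
The intermediate step can be written out explicitly. The following is a sketch of the standard substitution, assuming a pure phase $\psi = e^{iS/\hbar}$ with constant amplitude (a simplifying assumption of this illustration) and writing $\partial_t$, $\partial_x$ for the partial derivatives.

```latex
\begin{align*}
i\hbar\,\partial_t \psi &= -(\partial_t S)\,\psi, \\
-\frac{\hbar^2}{2m}\,\partial_x^2 \psi
  &= \left[\frac{1}{2m}(\partial_x S)^2 - \frac{i\hbar}{2m}\,\partial_x^2 S\right]\psi, \\
\Longrightarrow\quad
-\partial_t S &= \frac{1}{2m}(\partial_x S)^2 + V(x) - \frac{i\hbar}{2m}\,\partial_x^2 S .
\end{align*}
```

The last term is the only one proportional to $\hbar$. Dropping it in the limit $\hbar \to 0$ leaves the Hamilton-Jacobi equation for $S$.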

The Hamilton function is a theoretical device, not something existing for real in space 
and time. What exists for real in space and time is the particle, not its Hamilton function. 
I am not aware of anybody suggesting to endow the Hamilton function with a realistic 
interpretation. 
Why, then, would anybody think of doing so with its corresponding quantum
object, the wave function? An interpretation of quantum theory that considers the wave
function a real object sounds absurd: it assigns reality to a calculation tool. If I say
“Tomorrow we can go hiking, or we can go swimming”, I am not making a statement
implying that tomorrow the two things will both happen: I am only expressing my ignorance,
my uncertainty, my lack of knowledge, my lack of information, about tomorrow.
Perhaps tomorrow neither plan will be realized, but certainly not both. The wave function of a
particle expresses probabilities, and these are related to our lack of knowledge. It does not
follow that the wave function is a real entity spread in space, or, worse, in configuration
space. The wave function (the state) is like the Hamilton function: a computational
device, not a real object (Dürr et al., 1995).

In science, abstract concepts have sometimes been recognized as real entities. But often
the opposite has happened: a misleading attitude has come from realism barking up the
wrong tree. Examples are the crystal spheres, the caloric, the ether... A realist should not be
realist about the quantum state, if she wants to avoid the unpalatable alternative between 
the Scylla of the quantum collapse or the Charybdis of the branching many worlds. We 
can make sense of quantum theory with a softer realism, rather than with an inflated one. 
The wave function is about the relative information that systems have about each other: it 
is about information or lack of information, it is about what we can expect about the next 
real quantum event. 


15.2.4 Relevance for Quantum Cosmology 


As argued in the previous section, cosmology, in the conventional sense, is the study of
the dynamics of the large scale degrees of freedom of the universe. It is not about everything, it is about
a relatively small number of degrees of freedom. It is essentially based on an expansion 
in modes and neglects short wavelength modes, where “short” includes modes of millions 
of light years. It describes the way the large modes affect all the (classical) rest, including 
us. The dynamics of these degrees of freedom is mostly classical, but it can be influenced 
by quantum mechanics, which introduces discreteness and lack of determinism, in some 
regimes such as the early universe. This quantum aspect of the dynamics of the universe 
at large can be taken into account by standard quantum mechanics. The “observer” in 
this case is simply formed by ourselves and our instruments, which are not part of the 
large scale dynamics. The quantum state of the system represents the information we have 
gathered so far about the large scale degrees of freedom. As usual in quantum theory, we 
can often “guess” aspects of these states, completing our largely incomplete observations 
about them. 

As far as the other interpretation of the term “cosmology” is concerned, namely as totology,
in the light of the relational interpretation of quantum theory it is perhaps questionable
whether a coherent “quantum totology” makes any sense at all. If we define a system as the
totality of everything that exists, it is not clear what it means to study how this system
appears to an observer, since by definition there is no observer. The problem is open, but
it is not much related to the concerns of real-life cosmologists, or to questions such as
what happened at the big bang.

We can then use standard quantum theory, for instance in the form of a Hilbert space for
those large degrees of freedom, together with observables that describe the physical interaction between
the large degrees of freedom and our telescopes. For instance we can measure the temperature
of the CMB. This can be done with the usual conceptual tools of quantum theory.
The system does not include the observer. The observer in cosmology is indeed the majority
of stuff in the universe: all the degrees of freedom of the universe except for the large
scale ones. This definition of cosmology eliminates any problem about the observer. A measurement
does not need to “affect the large scale universe”. It only affects our information
about it.


15.3 Quantum Gravity and Quantum Cosmology 


As best as we know, gravity is described by general relativity. The peculiar symmetries of 
general relativity add specific conceptual issues to the formulation of quantum cosmology. 
Among these is the fact that instead of a Schrödinger equation, evolving in a time variable
$t$, we have a Wheeler-DeWitt equation, without any explicit time variable.

It is important to distinguish different issues. The discussion of the two previous sections
would not change if gravity were described by Newton’s theory and our main quantum
dynamical equation were a Schrödinger equation. The distinction between cosmology and
totology would not change. Contrary to what is often stated, the lack of a time variable
in the Wheeler-DeWitt equation has no direct relation with the existence or the absence
of an external quantum observer, because the problem of giving a quantum description of the
totality of things would be the same even if gravity were Newtonian.

The additional, specific problem raised by general relativity is how to describe evolution 
in the presence of general covariance. The solution, on the other hand, is well known and is 
very similar to the solution of the problem of the observer discussed above: quantum theory 
describes the state of a system relative to another — arbitrary — system. General relativity 
describes the evolution of some observables relative to other — arbitrary — observables. 

This solution is recalled below. 


15.3.1 The Relational Nature of General Relativity 


The celebrated idea of Einstein in 1915 is that spacetime is the gravitational field. This 
implies that spacetime is a dynamical field. General relativity does not imply that the gravitational
field is particularly different from other fields. It implies that physical fields
do not live in spacetime: rather, the universe is made of several fields, interacting with one
another. One of these is the gravitational field. The description of some specific aspect of a 
configuration of this field is what we call geometry. 

In the region of the universe where we live, a good approximation is obtained by neglecting
both the dynamics of the gravitational field and its local curvature, and using the
gravitational field as a fixed background with respect to which we can define acceleration
and write Newton’s second law.

Localization is relative to other dynamical objects, including the gravitational field. This
is also true for temporal localization. In general relativity there is no fixed background
structure, nor a preferred time variable with respect to which events are localized in time.
In Newtonian physics there is a time along which things happen; in general relativity there
is no preferred time variable. Physical variables evolve with respect to one another. For
instance, if we keep a clock at a fixed altitude and we throw a second clock upward so that
it rises and then falls back next to the first clock, the readings of the two clocks will then
differ. Given sufficient data, general relativity allows us to compute the reading of the first, $t_1$,
as a function of the second, $t_2$, or vice versa. Neither of the two variables $t_1$ and $t_2$ is a more
legitimate “time” than the other.
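
A worked estimate makes the relation between the two clock readings concrete. This is a sketch under assumptions not made in the text: a weak, uniform gravitational field $g$, speeds much smaller than $c$, the first clock at rest at height zero and reading $t_1$, and the second clock thrown vertically with speed $v_0$ and reading $t_2$ when it returns.

```latex
\begin{align*}
d\tau &\simeq dt\left(1 + \frac{g\,h}{c^2} - \frac{v^2}{2c^2}\right),
\qquad h(t) = v_0 t - \tfrac12 g t^2, \quad v(t) = v_0 - g t, \quad t_1 \simeq \frac{2v_0}{g}, \\
t_2 - t_1 &\simeq \frac{1}{c^2}\int_0^{t_1}\left(g\,h - \tfrac12 v^2\right)dt
 = \frac{1}{c^2}\left(\frac{2v_0^3}{3g} - \frac{v_0^3}{3g}\right)
 = \frac{v_0^3}{3\,g\,c^2}, \\
t_2(t_1) &\simeq t_1 + \frac{g^2\,t_1^3}{24\,c^2}.
\end{align*}
```

For $v_0 = 10$ m/s and $g \approx 9.8$ m/s$^2$ the thrown clock is ahead by roughly $4 \times 10^{-16}$ s. The last line expresses one clock reading as a function of the other, with no reference to the coordinate $t$ used in the intermediate steps.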

This observation can be formalised in terms of the notion of partial observable (Rovelli, 
2013), which provides a clean way to deal with the peculiar gauge structure of general 
relativity. 

In the example above, the two variables $t_1$ and $t_2$, representing the readings of the two
clocks, are partial observables. Both quantities can be measured, but neither of the two can
be predicted, of course, because we do not know “when” an observation is made. But the 
value of each can be predicted once the other is known. The theory predicts (with sufficient 
information) the relation between them. 

In the Hamiltonian formulation of the theory, the dynamics of general relativity is gen- 
erated by constraints, as a direct consequence of the absence of background. The solution 
of those constraints codes the evolution of the system, without external time with respect 
to which evolution can be described (Rovelli, 1991). Partial observables are functions on 
the extended phase space where constraints are defined. On the constraint surface, the 
constraints generate orbits that determine relations between partial observables. These 
relations express the classical dynamical content of the theory. 

The prototypical example of a partial observable which can be measured but not 
predicted is the conventional time variable of Newtonian physics: a quantity that we routinely
determine (looking at a clock) but cannot predict from the dynamics of the
system. 

The physical phase space is the space of these orbits. A point of phase space cannot 
be interpreted as the characterization of the state of the system at a given time, because 
the theory has no notion of time. Rather, it can be interpreted as a way of designating
a full solution of the equations of motion. But the dynamical information is not just in 
the physical phase space: it is in the relation between partial observables that each orbit 
determines. 

In the quantum theory, the strict functional dependence between partial observables 
determined by the classical dynamics is replaced by transition amplitudes and transition 
probabilities. Thus, in a general covariant theory, physical observations are given by the 
transition amplitudes between eigenstates of partial observables. Formally, these are given 
by the matrix elements of the projector on the solutions of the Wheeler-DeWitt equation
between these states, which are defined on the same extended Hilbert space on which the
Wheeler-DeWitt operator is defined.


15.3.2 Time in Quantum Cosmology 


In cosmology it is often convenient to choose one of the variables and treat it as a “clock 
variable”, that is, an independent variable with respect to which we study the evolution 
of the others. The choice is dictated by convenience, and has no fundamental significance 
whatsoever. For instance, there is no particular reason for choosing an independent vari- 
able that evolves monotonically along the orbits, or that defines a unitary evolution in the 
quantum theory. 

In the case of a homogeneous and isotropic cosmology, where only one degree of free- 
dom is considered, at least a second one is needed, to be used as a clock. A common choice 
is a massless scalar field. A more ingenious strategy can be implemented with enough 
degrees of freedom: for instance in the Bianchi I cosmology, where the three spatial directions
can evolve independently, one spatial direction can be taken to play the role of time.
One should be careful to deal with regions where the chosen time fails to be monotonic. 
For instance the scale factor, that is implicitly taken as a clock in many models of quantum 
cosmology, is not monotonic if there is a recollapse. 

Consider the study of the quantum mechanics of the scale factor $a$ plus a single other
degree of freedom, say a scalar field $\phi$ representing the average matter energy density. The
two variables $\phi$ and $a$ are partial observables. Predictions are extracted from their relative
evolution, realizing Einstein’s relationalism (Ashtekar, 2007).

In loop quantum cosmology, in particular, the dynamics studied include effects of the 
fundamental spacetime discreteness revealed by loop quantum gravity, using the technique 
of loop quantization. Among the results of the theory are the generic resolution of curvature 
singularities and the indication of the existence of a bounce replacing the initial singularity: 
a classical contracting solution of Einstein’s equations can be connected to an expanding 
one via quantum tunneling. The bounce is a consequence of the Heisenberg relations for
gravity, in the same way in which, for an atom, they prevent the electrons from
falling into the nucleus (Ashtekar et al., 2006; Bojowald, 2001). In the simplest case of a FLRW universe
with no curvature, the effective equations provide a simple modification of the Friedmann
equation:


$$\left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3}\,\rho\left(1 - \frac{\rho}{\rho_c}\right), \tag{15.6}$$


where $\rho_c$ is the critical density at which the bounce is expected, which can be computed and
corresponds, roughly speaking, to the Planck density. The effects of the bounce on standard
cosmological observables, such as CMB fluctuations, have been studied extensively; see for
instance (Ashtekar and Barrau, 2015).
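
The behaviour of the effective equation can be explored numerically. The sketch below assumes the standard form $H^2 = \frac{8\pi G}{3}\rho\,(1 - \rho/\rho_c)$ in units where $G = 1$, and, as a further assumption of this illustration, a massless scalar field as matter content, for which $\rho \propto a^{-6}$ and the equation has the closed-form solution coded here. All function names are hypothetical.

```python
import math


def hubble_squared(rho, rho_c, G=1.0):
    """Effective Friedmann equation: H^2 = (8 pi G / 3) rho (1 - rho/rho_c)."""
    return (8.0 * math.pi * G / 3.0) * rho * (1.0 - rho / rho_c)


def density_massless_scalar(t, rho_c, G=1.0):
    """Density for rho ~ a^-6 matter; t = 0 is the bounce, where rho = rho_c."""
    return rho_c / (1.0 + 24.0 * math.pi * G * rho_c * t * t)


def scale_factor_massless_scalar(t, rho_c, a_bounce=1.0, G=1.0):
    """Scale factor matching the density above, since a ~ rho^(-1/6)."""
    return a_bounce * (1.0 + 24.0 * math.pi * G * rho_c * t * t) ** (1.0 / 6.0)


if __name__ == "__main__":
    rho_c = 1.0
    for t in (-1.0, -0.1, 0.0, 0.1, 1.0):
        rho = density_massless_scalar(t, rho_c)
        a = scale_factor_massless_scalar(t, rho_c)
        print(t, a, rho, hubble_squared(rho, rho_c))
```

The expansion rate vanishes at $\rho = \rho_c$, where the scale factor reaches its minimum, and the classical Friedmann equation is recovered for $\rho \ll \rho_c$. The scale factor decreases for $t < 0$ and increases for $t > 0$, so it is not monotonic across the bounce.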

These results are of course tentative and wait for an empirical confirmation, but the 
standard difficulty regarding time evolution does not plague them. 


15.3.3 Covariant Loop Quantum Gravity


If we move from the quantum description of a small number of degrees of freedom, as in the
last section, to a full quantum theory of gravity, which is ultimately needed in quantum 
cosmology, some interesting structures appear. 

The main point is that in the absence of time we have to modify the notion of “physical 
system” used in Section 15.2. The reason is that the notion of quantum system implies a 
permanence in time which loses meaning in the fully covariant theory: how do we identify 
the “same system” at different times in a covariant field theory? 

The solution is to restrict to local processes. The amplitudes of quantum gravity can be 
associated to finite spacetime regions, and the states of quantum gravity to the boundary of 
these regions. In fact, in quantum gravity we may even identify the notion of a spacetime 
region with the notion of process, for which we can compute transition amplitudes. The 
associated transition amplitudes depend on the eigenstates of partial observables that we 
can identify with spatial regions bounding the spacetime region of the process. In particu- 
lar, loop quantum gravity gives a mathematically precise definition of the state of space, the 
boundary observables, and the amplitude of the process, in this framework. The possibility 
of this “boundary” formalism (Oeckl, 2003) stems from a surprising convergence between 
general relativity and quantum theory, which we have implicitly pointed out above. 

We can call “Einstein’s relationalism” the fact that in general relativity localization of an 
event is relative to other events. We can call “quantum relationalism” the fact that quantum
theory is about the manner in which a system affects another system. In Bohr’s quantum theory,
attention was always on the split between the quantum system and the classical world, but we have
seen that relational quantum theory allows us to democratize this split and describe the 
influence of any system on any other. These two relationalisms, however, appear to talk to 
one another, because of the locality of all interactions (Vidotto, 2013). 

Indeed, one of the main discoveries in modern physics is locality: interactions at dis- 
tance of Newton’s kind do not seem to be part of our world. In the standard model things 
interact only when they “touch”: all interactions are local. But this means that objects in 
interactions should be in the same place: interaction requires localization and localization 
requires interaction. To be in interaction corresponds to being adjacent in spacetime and 
vice versa: the two reduce to one another. 


Quantum relationalism                 <->   Einstein’s relationalism
Systems interact with other systems   <->   Systems are located with regard to other systems
Interaction = Localization            <->   Localization = Interaction


Bringing the two perspectives together, we get to the boundary formulation of quantum
gravity: the theory describes processes and their interactions. The manner in which a process affects
another is described by the Hilbert-space state associated to its boundary. The probabilities of
one or another outcome are given by the transition amplitudes associated to the bulk, and
are obtained from the matrix elements of the projector on the solutions of the Wheeler-DeWitt
equation.

Figure 15.2 Boundary values of the gravitational field = geometry of box surface = distance and
time separation of measurements.

Let us make this more concrete. Consider a process such as the scattering of some par- 
ticles at CERN. If we want to take into account the gravitational field, we need to include 
it as part of the system. In doing quantum gravity, the gravitational field (or spacetime) is 
part of the system. Distance and time measurements are field measurements like the others 
in general relativity: they are part of the boundary data of the problem. 

Thinking in terms of functional integrals, we have to sum over all possible histories, but 
also all possible geometries associated to a given finite spacetime region. 

In the computation of a transition amplitude, we need to give the boundary data of the
process, which are for instance the positions of a particle at an initial and a final time. We use
rods and clocks to define them. But those measure geometrical information that is just the
value of the gravitational field. All we have to give is the value of the fields on the
boundary. This includes the gravitational field, from which we can say how much time has
passed and the distance between the initial and the final point. Geometrical and temporal 
data are encoded in the boundary state (see Figure 15.2), because this is also the state of 
the gravitational field, which is the state of spacetime. 

This clarifies that in quantum gravity a process is a spacetime region. 

Now, we have seen that in relational quantum mechanics we need systems in interaction. 
What defines the system and when is it interacting? For spacetime, a process is simply a 
region of spacetime. Spacetime is a quantum mechanical process once we do quantum 
gravity. Vice versa, this now helps us to understand how to do quantum gravity. 

Notice that from this perspective quantum gravitational processes are defined locally, 
without any need to invoke asymptotic regions. Summarizing: 


Quantum dynamics of spacetime
Processes     ->  Spacetime regions
States        ->  Boundaries = spatial regions
Probability   ->  Transition amplitudes
Discreteness  ->  Quanta of space


15.3.4 Discreteness in Quantum Gravity


In Section 15.2.1, I have discussed how $\hbar$ gives us a unit of action in phase space, and a
conversion factor between action and information. In gravity the phase space is the one
of possible four-dimensional (4D) geometries, and there is the Newton constant $G$, which
transforms regions of phase space into lengths. What kind of discreteness does this imply?

The answer is the well-known Planck length, originally pointed out by Bronstein
while debating a famous argument by Landau on the measurability of fields, in the case of the
gravitational field (Rovelli and Vidotto, 2014).

The argument is simple: in order to check what happens in a small region of spacetime,
we need a test particle. The smaller the region, the more energetic the particle should be,
until its energy curves spacetime enough to form a black hole, whose horizon (beyond which nothing
can be seen) is larger than the original region we wanted to probe. Because of this, it is
not possible to probe scales smaller than the Planck length $\ell_{\rm Pl}$. This is the core of
quantum gravity: the discovery that there is a minimal length.
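
The argument can be summarised as an order-of-magnitude estimate. The following is a sketch that ignores numerical factors of order one; $L$ denotes the size of the region to be probed and $R_S$ the horizon radius associated with the energy $E$ of the probe (both symbols are introduced here for illustration).

```latex
\begin{align*}
\Delta p \gtrsim \frac{\hbar}{L}
\;&\Rightarrow\; E \sim \frac{\hbar c}{L},
\qquad
R_S \sim \frac{G E}{c^4} \sim \frac{\hbar G}{L\,c^3}, \\
R_S \lesssim L
\;&\Rightarrow\;
L \gtrsim \sqrt{\frac{\hbar G}{c^3}} \equiv \ell_{\rm Pl} \approx 1.6\times10^{-35}\,\mathrm{m}.
\end{align*}
```

Below this scale the horizon produced by the probe is larger than the region it was meant to resolve, so increasing the energy further does not improve the resolution.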

These handwaving semiclassical arguments can be made rigorous in the loop theory by
studying the phase space of general relativity and the corresponding operators. Geometrical
quantities, such as areas, volumes and angles, are functions of the gravitational field that are
promoted to operators. Their discrete spectra describe a spacetime that is granular in the
same sense in which the electric field is made of photons. For instance, the spectrum of the
area can be computed and turns out to be discrete (Rovelli and Smolin, 1995):


$$A = 8\pi\gamma\,\frac{\hbar G}{c^3}\sqrt{j(j+1)}, \tag{15.7}$$


where $j$ is a half-integer, similarly to what happens for the angular momentum, and $\gamma$ is
the Barbero-Immirzi parameter. This has a
minimal eigenvalue: a minimal value for the area.
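
A short numerical sketch shows what such a spectrum looks like. It assumes the commonly quoted loop-gravity form $A_j = 8\pi\gamma\,\ell_{\rm Pl}^2\sqrt{j(j+1)}$, with the Barbero-Immirzi parameter $\gamma$ set to 1 and areas measured in units of $\ell_{\rm Pl}^2$; both choices, and the function name, are assumptions made only for illustration.

```python
import math
from fractions import Fraction


def area_eigenvalue(j, gamma=1.0, planck_length_sq=1.0):
    """Area eigenvalue 8*pi*gamma*l_Pl^2*sqrt(j(j+1)) for half-integer spin j."""
    j = Fraction(j)
    if j < 0 or (2 * j).denominator != 1:
        raise ValueError("j must be a non-negative half-integer")
    return 8.0 * math.pi * gamma * planck_length_sq * math.sqrt(float(j * (j + 1)))


if __name__ == "__main__":
    # First few eigenvalues, j = 1/2, 1, 3/2, 2, in units of l_Pl^2.
    for twice_j in range(1, 5):
        j = Fraction(twice_j, 2)
        print(j, area_eigenvalue(j))
```

The smallest nonzero value is obtained for $j = 1/2$ and equals $4\sqrt{3}\,\pi\gamma\,\ell_{\rm Pl}^2$; the spacing between consecutive eigenvalues is not uniform, unlike the levels of a harmonic oscillator.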

Loop quantum gravity describes how these quanta of spacetime interact with one 
another. The notion of geometry emerges only from the semiclassical picture of these inter- 
actions. Formally, the theory is defined as follows. Every quantum field theory can be given 
in terms of a triple (H, A, W): respectively a Hilbert space where the states live, an algebra 
of operators, and the dynamics defined in the covariant theory by a transition amplitude. 

The interactions with the field manifest the discreteness of quantum mechanics: the fun- 
damental discreteness appears in the presence of particles, that are just the quanta of a 
field, and in the spectrum of the energy of each mode of the field. The same structure 
applies to loop quantum gravity: states, operators and transition amplitudes can be properly 
defined (Rovelli and Vidotto, 2014) and there is a fundamental discreteness: the granular- 
ity of spacetime, yielded by the discreteness of the spectrum of geometrical operators. The 
geometry is quantized: eigenvalues are discrete and operators do not commute. Nodes carry 
discrete quanta of volume (quanta of space) and the links discrete quanta of area. Area and 
volume form a complete set of commuting observables and have discrete spectra. 

States in loop quantum gravity are associated to graphs characterized by N nodes and
L links. They can be thought of as analogous to the N-particle states of standard quantum
field theory, but with some extra information given by the links, which turn out to be
adjacency relations coding which “quanta of spacetime” are interacting with one another.
These quanta are spacetime; they do not live in spacetime. The graphs are colored with
quantum numbers, i.e. spins, forming the mathematical object called “spinnetwork” (Penrose,
1971). Penrose’s “spin-geometry” theorem connects the graph Hilbert space with the
description of the geometry of a cellular decomposition of spacetime (see Figure 15.3).

Figure 15.3 (a) Spinnetwork; (b) Spinfoam. A spinnetwork provides a representation of the net of interactions between quanta of
space building up the weave of space. A spinfoam represents the evolution of a spinnetwork. Links
evolving create the faces of the foam. Vertices can create or annihilate nodes of the spinnetwork.

Notice that the full Hilbert space of the theory is formally defined in the limit of an 
infinite graph, but the physical theory is captured by a finite graph in the same way in which 
the Fock space is truncated to N particles. The truncation to a given finite graph captures 
the relevant degrees of freedom of the state we are interested to describe, disregarding those 
that need a “larger” graph to be defined. 

A transition amplitude that represents a history of the geometry, in terms of graph states,
becomes a history of the boundary graphs, or a “spinfoam”. In a spinfoam, quanta/nodes
and links/relations get transformed into a new configuration by the action of interaction
vertices in the bulk. A link spans a face through its history. This is the way of picturing a
history of the quanta when these quanta make up spacetime themselves. This yields an 
ontological unification where all that exists are covariant quantum fields (Vidotto, 2014). 


15.3.5 Spinfoam Cosmology 


On a compact space we can expand the dynamical fields in discrete modes. Truncating
the theory to a finite number of modes defines an approximation to the full theory. This
is neither a large scale approximation nor a short scale approximation, because the total
space can still be very large or very small, as its scale is determined by the lowest modes 
of the gravitational field. Rather, the approximation is in the ratio between the largest and 
the smallest relevant wavelengths considered. 

The graph expansion of the spin-network formulation of loop quantum gravity can be 
put in correspondence with this mode expansion of the fields on a compact space (Rovelli
and Vidotto, 2008). A truncation on a fixed graph corresponds then to a truncation in
the mode expansion. The truncation provides a natural cut-off of the infinite degrees of
freedom of general relativity down to a finite number. Choosing a graph, we disregard the 
higher modes of this expansion. The truncation defines an approximation viable for grav- 
itational phenomena where the ratio between the longest and the shortest wavelengths is 
bounded. 

Since this is neither an ultraviolet nor an infrared truncation, what is lost are not wave- 
lengths shorter than a given length, but rather wavelengths k times shorter than the full size 
of physical space, for some integer k. 

This approximation is useful in cosmology. According to the cosmological principle, 
the dynamics of a homogeneous and isotropic space provides a good first order approx- 
imation to the dynamics of the real universe. Inhomogeneities can be disregarded in 
a first approximation. Notice that the approximation is not just a large scale approxi- 
mation, because the universe may be small at some point of its evolution. Rather, the 
truncation is in the ratio between the scale of the inhomogeneities and the scale fac- 
tor. At lowest order, we consider the dynamics of the whole universe as described 
solely by the scale factor; this ratio is unity and a single degree of freedom is sufficient.
We can then recover the rest of the theory by adding degrees of freedom progressively.
In the context of spinfoam cosmology, this can be obtained by progressively refining
the graph.

A graph with a single degree of freedom is just a single node: in a certain sense, this is 
the case of usual Loop Quantum Cosmology. To add degrees of freedom, we add nodes and 
links with a coloring (Borja et al., 2012). These further degrees of freedom are a natural 
way to describe inhomogeneities and anisotropies. 

Therefore, a single graph provides a useful calculation tool in cosmology. It is pos- 
sible to generalize the spinfoam techniques for cosmology to large graphs. In a regular 
graph, which corresponds to a regular cellular decomposition, nodes and links become 
indistinguishable, and we obtain back the unique FLRW degrees of freedom (Vidotto, 
2011). For an arbitrarily large regular graph, we can define coherent states and place 
them on a homogeneous and isotropic geometry, to represent macroscopic cosmological 
states.

Once we interpret the graph states as describing a cosmological evolution, we can com- 
pute cosmological transition amplitudes (Bianchi et al., 2010, 2011; Vidotto, 2010). This 
transition amplitude makes concrete the notion of a sum over possible histories, namely all 
the possible 4-geometries compatible with the given three-dimensional (3D) states on the 
boundary.⁴ 

The advantage of this formalism is that it is fully Lorentzian, the amplitudes are infrared 
and ultraviolet finite, and they have good classical behavior, being peaked 
on solutions of classical general relativity. Since the theory is non-perturbative, we are 
allowed to use these equations in the deep quantum regime, where a perturbative calculation 
would leave its domain of validity. The hope is to obtain a full description of the 
quantum fluctuations at the bounce that replace the classical singularities. 

Once again, the theory is tentative and may have technical difficulties, but there are no 
conceptual obstacles, if we adopt a fully relational perspective. 


15.4 Conclusions 


The world can be described in terms of facts. Facts happen at interactions, namely when 
a system affects another system, or, in a covariant theory, when a process affects another 
process. Relational quantum mechanics is the understanding of quantum theory in these 
terms. The resulting ontology is relational, characterizes quantum theory and is more subtle 
than that of classical mechanics. 

Attributing ontological weight to the wave function is misleading. Quantum states are 
only the coding of past events that happened between two systems, or, in a generally covari- 
ant theory, between two processes. The way a system will affect another system in the 
future is probabilistically determined by the manner it has done so in the past, and physics 
is about the determination of such probabilistic relations. The quantum state codes the 
information relevant for this determination. 

The amount of information is discrete in quantum theory. The minimal amount of 
information is determined by the Planck constant. The core of quantum theory is the 
discreteness of the information. 

Cosmology is not about everything. It is about a few large scale degrees of freedom. It 
is based on an expansion in modes and neglects “short” wavelength modes, where “short” 
means millions of light years. Accordingly, in quantum cosmology the system does not 
include the observer. The observer is ourselves and our instruments, which are at a scale 
smaller than the cosmological scale. 

In contrast, it is not clear whether a quantum theory of everything — a “totology” — 
makes sense, because quantum theory describes how a system affects another system. This 
question, however, has no bearing on standard quantum cosmology, which describes how 
the large scale degrees of freedom affect our instruments. 


4 In the lowest approximation, the classical theory expresses the dynamics as a relation between the scale factor 
and its momentum. Consequently, in quantum theory, at the first order in the vertex expansion the probability 
of measuring a certain “out” coherent state does not depend on the “in” coherent state. In other words, at the 
first order the probability is dominated by the product of the probabilities of each state to exist. Each term is 
given by a sum over all the possible 4D geometries compatible with the state representing a given 3D geometry. 
This is exactly the “spinfoam version” of the wave function of the universe (Hartle and Hawking, 1983). 

Similarly, a preferred notion of time is not needed in quantum cosmology because the 
theory is about relations between partial observables, and dynamics is the study of these 
correlations. 

Quantum gravity is the theory of the existence of a minimal length. Spacetime is (quantum) 
discrete. Since spacetime is dynamical, processes are spacetime regions. Neither 
space nor time is defined inside a process. Thanks to this, the application of quantum gravity 
to quantum cosmology cures the initial singularity and may lead to observable effects. 

Using this relational understanding of quantum mechanics and of evolution, a coherent 
and consistent formulation of quantum cosmology is possible. 


References 


Ashtekar, A. 2007. An Introduction to Loop Quantum Gravity Through Cosmology. Nuovo 
Cim., 122B, 135-55. 

Ashtekar, A. and Barrau, A. 2015. Loop quantum cosmology: From pre-inflationary 
dynamics to observations. Classical and Quantum Gravity, 32(23), 234001. 

Ashtekar, A., Pawlowski, T. and Singh, P. 2006. Quantum nature of the big bang. Phys. 
Rev. Lett., 96, 141301. 

Bianchi, E., Rovelli, C. and Vidotto, F. 2010. Towards Spinfoam Cosmology. Phys. Rev., 
D82, 084035. 

Bianchi, E., Krajewski, T., Rovelli, C. and Vidotto, F. 2011. Cosmological constant in 
spinfoam cosmology. Phys. Rev., D83, 104015. 

Bojowald, M. 2001. Absence of singularity in loop quantum cosmology. Phys. Rev. Lett., 
86, 5227-30. 

Borja, E. F., Garay, I. and Vidotto, F. 2012. Learning about quantum gravity with a couple 
of nodes. SIGMA, 8, 015. 

Dorato, M. 2013. Rovelli’s relational quantum mechanics, monism and quantum becom- 
ing. arXiv: 1309.0132. 

Dürr, D., Goldstein, S. and Zanghì, N. 1995. Bohmian mechanics and the meaning of 
the wave function. In Foundations of Quantum Mechanics: A Symposium in Honor of 
Abner Shimony, Boston, Massachusetts, 19-20 September, 1994. 

Grinbaum, A. 2003. Elements of information-theoretic derivation of the formalism of 
quantum theory. Int. J. Quantum Inf., 1(3), 289-300. 

Hartle, J. B. and Hawking, S. W. 1983. Wave Function of the Universe. Phys. Rev., D28, 
2960-75. 

Hoehn, P. A. 2014. Toolbox for reconstructing quantum theory from rules on information 
acquisition. arXiv: 1412.8323. 

Oeckl, R. 2003. A ‘general boundary’ formulation for quantum mechanics and quantum 
gravity. Phys. Lett., B575, 318-24. 

Penrose, R. 1971. Angular momentum: An approach to combinatorial spacetime. In Bastin, 
T., ed. Quantum Theory and Beyond. Cambridge: Cambridge University Press. 

Rovelli, C. 1991. Time in Quantum Gravity: Physics Beyond the Schrödinger Regime. 
Phys. Rev., D43, 442-456. 

Rovelli, C. 1996. Relational quantum mechanics. Int. J. Theor. Phys., 35, 1637-78. 

Rovelli, C. 2002. Partial observables. Phys. Rev., D65, 124013. 

Rovelli, C. 2013. Why Gauge? arXiv: 1308.5599 [hep-th]. 

Rovelli, C. and Smolin, L. 1995. Discreteness of area and volume in quantum gravity. 
Nucl. Phys., B442, 593-622. 


Rovelli, C. and Vidotto, F. 2008. Stepping out of Homogeneity in Loop Quantum 
Cosmology. Class. Quant. Grav., 25, 225024. 

Rovelli, C. and Vidotto, F. 2014. Covariant Loop Quantum Gravity. Cambridge 
Monographs on Mathematical Physics. Cambridge: Cambridge University Press. 

Vidotto, F. 2010. Spinfoam Cosmology: quantum cosmology from the full theory. J. Phys.: 
Conf. Ser., 314, 012049. 

Vidotto, F. 2011. Many-nodes/many-links spinfoam: the homogeneous and isotropic case. 
Class. Quant. Grav., 28, 245005. 

Vidotto, F. 2013. Atomism and Relationalism as guiding principles for Quantum Gravity. 
Talk at the “Seminar on the Philosophical Foundations of Quantum Gravity”, Chicago. 

Vidotto, F. 2014. A relational ontology from General Relativity and Quantum Mechanics. 
Talk at the XIth International Ontology Congress, Barcelona, October. 


16 
Cosmological Ontology and Epistemology 


DON N. PAGE 


16.1 Introduction 


In cosmology, we would like to explain our observations and predict future observations 
from theories of the entire universe. Such cosmological theories make ontological assump- 
tions of what entities exist and what their properties and relationships are. One must also 
make epistemological assumptions or metatheories of how one can test cosmological the- 
ories. Here I shall propose a Bayesian analysis in which the likelihood of a complete 
theory is given by the normalized measure it assigns to the observation used to test the 
theory. In this context, a discussion is given of the trade-off between prior probabilities 
and likelihoods, of the measure problem of cosmology, of the death of Born’s rule, of the 
Boltzmann brain problem, of whether there is a better principle for prior probabilities than 
mathematical simplicity, and of an Optimal Argument for the Existence of God. 

Cosmology is the study of the entire universe. Ideally in science, one would like the sim- 
plest possible theory from which one could logically deduce a complete description of the 
universe. Such theories must make implicit assumptions about the ontology of the universe, 
what are its existing entities and their nature. They may also make implicit assumptions 
about the ontology of the entire world (all that exists), particularly if other entities beyond 
the universe have relationships with the universe. 

For example, I am a Christian who believes that the universe was created by an omnipo- 
tent, omniscient, omnibenevolent, personal God who exists outside it but who relates to 
it as His creation. Therefore, the ontology I assume for the world includes not only our 
universe but also God and other entities He may have created (such as new heavens and 
new earth for us after death). However, my assumption that God has created our universe 
as an entity essentially separate from Himself means that one can also look for a theory of 
the universe itself, without including its relationship to God, although there may be some 
aspects of such a theory that can only be explained by the assumption that the universe 
was created by God. (For example, even though a complete theory of our universe would 
necessarily imply the facts that there is life and consciousness within it, since such entities 
do exist within our universe, there may not be a good explanation of these aspects of the 
theory apart from the concept of creation by a personal God.) 

Here I shall focus on theories of our universe as a separate entity, although near the end 
I shall speculate on some possible deeper explanations for why our universe is as it is. 

I shall define a complete theory for a universe to be one that completely describes or 
specifies all properties of that universe. I shall assume that an observer who could observe 
all properties of a universe and who has unlimited reasoning power could deduce a unique 
complete theory for that universe. (Equivalent complete descriptions of the theory I regard 
as the same theory. Depending upon one’s background knowledge and the effort needed to 
deduce consequences from logically equivalent assumptions, different equivalent complete 
descriptions may be assigned different levels of simplicity when one seeks the simplest 
complete description, but that involves subjective elements that I shall not get into here.) 

Of course, we do not know for certain what is the unique complete theory of our 
universe. What we can know and how we can know it is the subject of epistemology. 

Some of the complications of realistic epistemology arise from the limited reasoning 
powers of humans, such as the fact that we cannot deduce all the consequences of a set of 
axioms. (For example, the axioms of arithmetic and of complex numbers presumably imply 
whether the Riemann hypothesis is true or false, but so far humans have not been able to 
find a rigorous deduction of which is the case.) To avoid all the complications of limited 
reasoning power, for simplicity I shall just consider idealized cases in which observations 
are limited but reasoning powers are not. 

Despite this idealization, we are also restricted by the fact that our observations are 
limited and do not show the entire universe. A limited observation does not imply a unique 
complete theory for a universe. 

Therefore, beings like us with limited observations within the universe cannot hope to 
attain absolute certainty about a complete theory for the universe. The most we might 
hope for are posterior probabilities for different complete theories, probabilities taking 
into account our partial observations. But even the idealized observer cannot deduce the 
posterior probabilities of different theories fitting his or her partial observations without 
making subjective choices for the prior probabilities of these different theories. 

Here I shall lump all of the information accessible to the idealized observer when he 
or she assigns posterior probabilities to different theories into a single observation $O_k$, 
which includes memory elements of what in ordinary language one might consider to be 
many previous observations. The observation $O_k$ represents all that the idealized observer 
knows about the universe before coming up with various complete theories $T_i$ to explain 
this observation and to which the idealized observer wishes to assign posterior probabilities 
$P(T_i|O_k)$ that are conditional upon the observation $O_k$. 


16.2 A Bayesian Analysis for the Probabilities of Theories 


Consider theories $T_i$ that for each possible observation $O_j$ give a probability for that 
observation, $P(O_j|T_i)$. This of course is generically not the same as the probability $P(T_i|O_j)$ of 
the theory given the observation, which is a goal of science to calculate. 

If we assign (subjective) prior probabilities $P(T_i)$ to the theories $T_i$ (presumably higher 
for simpler theories, by Occam’s razor) and use an observation $O_k$ to test the theory, 
$P(O_k|T_i)$ is then the likelihood of the theory, and by Bayes’ theorem we can calculate 
the posterior probability of the theory as 


\[ P(T_i|O_k) = \frac{P(T_i)\,P(O_k|T_i)}{\sum_j P(T_j)\,P(O_k|T_j)} \tag{16.1} \]


We would like to get this posterior probability as high as possible by choosing a simple 
theory (high prior probability $P(T_i)$) that gives a good statistical fit to one’s observation $O_k$ 
(high likelihood $P(O_k|T_i)$). 
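
As a minimal numerical sketch of Eq. (16.1), one can compute posteriors for a handful of theories. The three priors and likelihoods below are hypothetical numbers chosen only for illustration, and the function name `posteriors` is likewise an assumption of this sketch rather than anything defined in the text.

```python
def posteriors(priors, likelihoods):
    """Return P(T_i|O_k) from priors P(T_i) and likelihoods P(O_k|T_i), as in Eq. (16.1)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # denominator: sum over j of P(T_j) P(O_k|T_j)
    if evidence == 0:
        raise ValueError("no theory assigns nonzero probability to the observation")
    return [x / evidence for x in joint]

# Hypothetical values: simpler theories get higher priors.
priors = [0.6, 0.3, 0.1]
# Hypothetical likelihoods P(O_k|T_i) for the single observation O_k.
likelihoods = [0.0, 0.01, 0.2]

post = posteriors(priors, likelihoods)
print(post)
```

In this toy case the theory with the highest prior has zero likelihood and therefore zero posterior, while the least simple theory ends up with the largest posterior because it fits the observation best.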

Since a complete theory for a universe should completely specify all properties of that 
universe with certainty, one might think that it has no room for probabilities between 0 and 
1, so that all likelihoods $P(O_k|T_i)$ would be 0 or 1. Then the posterior probabilities of these 
theories, given the observation $O_k$, would just be proportional to the prior probabilities of 
the theories $T_i$ that give $P(O_k|T_i) = 1$. 

For observations to play a bigger role in science, we would like a way of getting a range 
of likelihoods, $0 < P(O_k|T_i) < 1$. If the theories can give normalizable measures for the 
observations, then the normalized measures can be interpreted to be likelihoods $P(O_k|T_i)$ 
that can vary between 0 and 1. 

In a classical theory, the measures could be functionals of the phase-space trajectory, 
such as the times that the system spends in different regions of the phase space. In a quan- 
tum theory, the measures could be certain functionals of the quantum state (maps from the 
quantum state to non-negative numbers). 

Another argument against likelihoods all zero or one is the following: If a specific theory 
$T_i$ leads to the definite existence of more than one observation $O_j$, one might be tempted 
to say that $P(O_j|T_i) = 1$ for all of these observations, giving unit likelihood for any theory 
$T_i$ predicting the definite existence of the particular actual observation $O_k$ used to test the 
theory. However, this procedure would mean that one could construct a theory of maximum 
(unit) likelihood, no matter what the observation turns out to be, just by having the 
theory predict that all possible observations definitely exist. But this seems to be too cheap, 
supposedly explaining everything but actually explaining nothing. 

Therefore, I propose the normalization principle or metatheory that each theory $T_i$ 
should give normalized probabilities for the different possible observations $O_j$ that it 
predicts, so that for each $T_i$, the sum of $P(O_j|T_i)$ over all $O_j$ is unity. 

One could view theory construction as a contest, in which theorists are given only unit 
probabilistic resources to divide at will for each theory, and then others judge the theo- 
ries by assigning prior probabilities to the theories based on their simplicity. Normalizing 
the products of these prior probabilities and of the probabilities the theories assign to the 
observation used to test the theory then gives the posterior probabilities of the theories. 

For this to be a fair contest, each theory should be allowed a unit probability to distribute 
among all the observations predicted by the theory. Of course, if the theorists cheat and 
give unnormalized probabilities in their theories (such as assigning probabilities near one 
for more than one observation in the same theory), the judges can compensate for that by 
assigning correspondingly lower prior probabilities to such theories. However, it would 
give a clearer division between prior probabilities and likelihoods if theorists obeyed 
the proposed rules and normalized the observational probabilities for each theory so that 
they sum to one. 
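
The effect of the normalization principle can be sketched with a toy calculation. The observation labels, the count of 1000 possible observations, and the helper `normalize` are all hypothetical; the point is only that a theory predicting every observation with equal weight is left with a small likelihood for any one of them once its measures are normalized.

```python
def normalize(measures):
    """Rescale a theory's non-negative measures so that they sum to one."""
    total = sum(measures.values())
    if total <= 0:
        raise ValueError("a theory must assign positive total measure")
    return {obs: m / total for obs, m in measures.items()}

# Hypothetical "cheap" theory: every one of 1000 possible observations definitely exists.
observations = [f"O_{j}" for j in range(1, 1001)]
cheap = normalize({obs: 1.0 for obs in observations})

# Hypothetical focused theory: almost all of its measure goes to O_1.
focused = normalize({"O_1": 9.0, "O_2": 1.0})

print(cheap["O_1"], focused["O_1"])
```

If the observation used to test the theories is the one labeled `O_1`, the cheap theory's likelihood is diluted to one part in a thousand, whereas the focused theory retains most of its unit of probabilistic resources for that observation.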


16.3 Trade-Off Between Prior Probabilities and Likelihoods 


In a Bayesian analysis, we have (1) prior probabilities $P(T_i)$ for theories (intrinsic plausibilities), 
(2) conditional probabilities $P(O_k|T_i)$ for observations (‘likelihoods’ of the theories 
for a fixed observation), and (3) posterior probabilities $P(T_i|O_k) \propto P(T_i)P(O_k|T_i)$ for 
theories $T_i$, conditionalized upon the observation $O_k$. 

Prior probabilities are subjective, usually higher for simpler theories. The highest prior 
probability might be for the theory $T_1$ that nothing concrete (contingent) exists, but then 
$P(O_k|T_1) = 0$. $T_2$ might be the theory that all observations exist equally: $P(O_k|T_2) = 
1/\infty = 0$ when normalized (e.g. modal realism). 

At the other extreme would be a maximal-likelihood theory giving $P(O_k|T_i) = 1$ for 
one’s observation $O_k$ (and zero probability for all other observations), but this seems to 
require a very complex theory $T_i$ that might be assigned an extremely tiny prior probability 
$P(T_i)$, hence giving a very low posterior $P(T_i|O_k)$. 

Thus there is a trade-off between prior probabilities and likelihoods, that is, between 
intrinsic plausibility and fit to observations. 
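
The trade-off can be made concrete with a rough sketch that works in logarithms to cope with very small numbers. The prior used here, $2^{-\text{bits}}$ for a theory whose description takes the stated number of bits, is an assumption of this example and not a prescription from the text; the bit counts and likelihoods are hypothetical as well.

```python
import math

# (name, description length in bits [hypothetical], likelihood P(O_k|T_i) [hypothetical])
theories = [
    ("nothing exists", 1, 0.0),
    ("moderately simple", 100, 1e-20),
    ("maximal likelihood", 10_000, 1.0),
]

def log2_score(bits, likelihood):
    """log2 of the unnormalized posterior P(T_i) P(O_k|T_i), assuming a prior of 2**-bits."""
    if likelihood == 0.0:
        return -math.inf  # zero likelihood rules the theory out whatever its prior
    return -bits + math.log2(likelihood)

scores = {name: log2_score(bits, lik) for name, bits, lik in theories}
best = max(scores, key=scores.get)
print(best, scores)
```

Under these assumed numbers neither the simplest theory nor the best-fitting one wins; the highest posterior goes to the theory that balances intrinsic plausibility against fit.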


16.4 The Measure Problem of Cosmology 


The measure problem of cosmology (see, e.g. Linde and Mezhlumian [1], Vilenkin [2], 
Freivogel [3] and Page [4]) is how to obtain probabilities of observations from the quan- 
tum state of the universe. This is particularly a problem when eternal inflation leads to 
a universe of unbounded size so that there are apparently infinitely many realizations 
or occurrences of observations of each of many different kinds or types, making the 
ratios ambiguous. There is also the danger of domination by Boltzmann Brains, observers 
produced by thermal and/or vacuum fluctuations [5-8]. 

The measure problem is related to the measurement problem of quantum theory, how 
to relate quantum reality to our observations that appear to be much more classical. An 
approach I shall take is to assume that observations are fundamentally conscious percep- 
tions or sentient experiences (each perception being all that one is consciously aware of at 
once). 

I shall take an Everettian view that the wave function never collapses, so in the Heisenberg 
picture, there is one single fixed quantum state for the universe (which could be a 
“multiverse”). I assume that instead of “many worlds”, there are instead many different 
actually existing observations (sentient experiences) $O_k$, but that they have different positive 
measures, $\mu_k = \mu(O_k)$, which are in some sense how much the various observations 
occur, but they are not determined by just the contents of the observations. 

For simplicity, I shall assume that there is a countable discrete set $\{O_k\}$ of observations, 
and that the total measure $\sum_k \mu_k$ is normalized to unity for each possible complete theory 
$T_i$ that gives the normalized measures $\mu_k$ for all possible observations $O_k$. Then for a 
Bayesian analysis, I shall interpret the normalized measure $\mu_k$ of the observation $O_k$ that 
each theory $T_i$ gives as the probability of that observation given the theory, $P(O_k|T_i)$, which 
for one’s observation $O_k$ is the “likelihood” of the theory $T_i$. 


16.5 Sensible Quantum Mechanics or Mindless Sensationalism 


The map from the quantum state to the measures of observations could be non-linear. 
However, I assume a linear relationship in Sensible Quantum Mechanics [9-15] (which I 
have also called mindless sensationalism because it proposes that what is fundamental is 
not minds but conscious perceptions, which crudely might be called “sensations”, although 
they include more of what one is consciously aware of, e.g. memories, than what is usually 
called “sensations”): 


\[ \mu(O_k) = \sigma[\mathbf{A}(O_k)] = \text{expectation value of the operator } \mathbf{A}(O_k). \tag{16.2} \]


Here $\sigma$ is the quantum state of the universe (a positive linear functional of quantum operators), 
and $\mathbf{A}(O_k)$ is a non-negative “awareness operator” corresponding to the observation 
or sentient experience (conscious perception) $O_k$. The quantum state $\sigma$ (which could be a 
pure state, a mixed state given by a density matrix, or a C*-algebra state) and the awareness 
operators $\{\mathbf{A}(O_k)\}$ (along with the linear relationship above and a description of the 
contents of each $O_k$) are all given by the theory $T_i$. 


16.6 The Death of the Born Rule 


Traditional quantum theory uses the Born rule with the probability of the observation $O_k$ 
being the expectation value of $\mathbf{A}(O_k) = \mathbf{P}_k$, a projection operator ($\mathbf{P}_j\mathbf{P}_k = \delta_{jk}\mathbf{P}_k$, no 
sum over $k$) corresponding to the observation $O_k$, so 


\[ P(O_k|T_i) = \sigma_i[\mathbf{P}_k] = \langle \mathbf{P}_k \rangle_i. \tag{16.3} \]


Born’s rule works when one knows where the observer is within the quantum state (e.g. 
in the quantum state of a single laboratory rather than of the universe), so that one has def- 
inite orthonormal projection operators. However, Born’s rule does not work in a universe 
large enough that there may be identical copies of the observer at different locations, since 
then the observer does not know uniquely the location or what the projection operators are 
[16-19]. 

Why does the Born rule die? Suppose there are two identical copies of the observer, at 
locations $B$ and $C$, that can each make the observations $O_1$ and $O_2$ (which do not reveal 
the location). Born’s rule would give the probabilities $P_1^B = \sigma[\mathbf{P}_1^B]$ and $P_2^B = \sigma[\mathbf{P}_2^B]$ if the 
observer knew that it was at location $B$ with the projection operators there being $\mathbf{P}_1^B$ and 
$\mathbf{P}_2^B$. Similarly, it would give the probabilities $P_1^C = \sigma[\mathbf{P}_1^C]$ and $P_2^C = \sigma[\mathbf{P}_2^C]$ if the observer 
knew that it was at location $C$ with the projection operators there being $\mathbf{P}_1^C$ and $\mathbf{P}_2^C$. 

However, if the observer is not certain to be at either $B$ or $C$, and if $P_1^B < P_1^C$, then one 
should have $P_1^B < P_1 < P_1^C$. But there is no state-independent projection operator that 
gives an expectation value with this property for all possible quantum states. No matter 
what the orthonormal projection operators $\mathbf{P}_1$ and $\mathbf{P}_2$ are, there is an open set of states 
that gives expectation values that are not positively weighted means of the observational 
probabilities at the two locations. Thus the Born rule fails in cosmology. 

In more detail, consider normalized pure quantum states of the form 


\[ |\psi\rangle = b_{12}|12\rangle + b_{21}|21\rangle, \tag{16.4} \]


with arbitrary normalized complex amplitudes $b_{12}$ and $b_{21}$. The component $|12\rangle$ represents 
the observation 1 in the region $B$ and the observation 2 in the region $C$; the component $|21\rangle$ 
represents the observation 2 in the region $B$ and 1 in the region $C$. Therefore, $P_1^B = P_2^C = 
|b_{12}|^2$, and $P_2^B = P_1^C = |b_{21}|^2$. 

For Born’s rule to give the possibility of both observational probabilities being non-zero 
in the two-dimensional quantum state space being considered, the orthonormal projection 
operators must each be of rank one, of the form 


\[ \mathbf{P}_1 = |\psi_1\rangle\langle\psi_1|, \quad \mathbf{P}_2 = |\psi_2\rangle\langle\psi_2|, \tag{16.5} \]


where $|\psi_1\rangle$ and $|\psi_2\rangle$ are two orthonormal pure states. 

Once the state-independent projection operators are fixed, then if the quantum state is 
$|\psi\rangle = |\psi_1\rangle$, the expectation values of the two projection operators are $\langle\mathbf{P}_1\rangle = \langle\psi|\mathbf{P}_1|\psi\rangle = 
\langle\psi_1|\psi_1\rangle\langle\psi_1|\psi_1\rangle = 1$ and $\langle\mathbf{P}_2\rangle = \langle\psi|\mathbf{P}_2|\psi\rangle = \langle\psi_1|\psi_2\rangle\langle\psi_2|\psi_1\rangle = 0$. These extreme values 
of 1 and 0 are not positively weighted means of $P_1^B$ and $P_1^C$ and of $P_2^B$ and $P_2^C$ for any choice 
of $|\psi_1\rangle$ and $|\psi_2\rangle$ and any normalized choice of positive weights. Therefore, no matter what 
the orthonormal projection operators $\mathbf{P}_1$ and $\mathbf{P}_2$ are, there is at least one quantum state (and 
actually an open set of states) that gives expectation values that are not positively weighted 
means of the observational probabilities at the two locations. Thus Born’s rule fails. 
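
The argument above can be checked numerically in the two-dimensional state space spanned by $|12\rangle$ and $|21\rangle$. The following is only a sketch under simplifying assumptions: the amplitudes are taken real, parametrized by a hypothetical angle `theta`, and the weights are sampled on a coarse grid. Analytically, since $P_1^B + P_1^C = 1$ in this state space, any mean with strictly positive normalized weights is bounded by the larger weight and so stays below 1.

```python
import math

def state(theta):
    """Real-amplitude state cos(theta)|12> + sin(theta)|21> as a pair (b12, b21)."""
    return (complex(math.cos(theta)), complex(math.sin(theta)))

def rank_one_expectation(phi, psi):
    """<psi|P|psi> for the projector P = |phi><phi|, i.e. |<phi|psi>|**2."""
    overlap = sum(a.conjugate() * b for a, b in zip(phi, psi))
    return abs(overlap) ** 2

def location_probabilities(psi):
    """(P_1^B, P_1^C): observation 1 occurs at B in |12> and at C in |21>."""
    b12, b21 = psi
    return abs(b12) ** 2, abs(b21) ** 2

thetas = [k * math.pi / 20 for k in range(11)]  # psi_1 ranging from |12> to |21>
weights = [w / 10 for w in range(1, 10)]         # w_B = 0.1 ... 0.9, with w_C = 1 - w_B

# Born value <P_1> when the state of the universe equals the projector's own state psi_1.
born_values = [rank_one_expectation(state(th), state(th)) for th in thetas]

# Largest positively weighted mean of P_1^B and P_1^C over the same states and weights.
largest_mean = max(
    w_B * p_B + (1.0 - w_B) * p_C
    for th in thetas
    for p_B, p_C in [location_probabilities(state(th))]
    for w_B in weights
)

print(min(born_values), largest_mean)
```

For every sampled choice of $|\psi_1\rangle$ the Born expectation value is 1, while no sampled weighted mean of the two location probabilities exceeds the largest weight on the grid.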

The failure of the Born rule means that in a theory $T_i$, the awareness operators $\mathbf{A}_i(O_k)$, 
whose expectation values in the quantum state $\sigma_i$ of the universe give the probabilities 
or normalized measures for the observations or sentient experiences $O_k$, as $P(O_k|T_i) = 
\mu_i(O_k) = \sigma_i[\mathbf{A}_i(O_k)] = \langle\mathbf{A}_i(O_k)\rangle_i$, cannot be projection operators. However, the awareness 
operators could be weighted sums or integrals over spacetime of localized projection 
operators $\mathbf{P}_i(O_k, x)$ at locations denoted schematically by $x$, say onto brain states there that 
would produce the observations or sentient experiences. 
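
As a toy illustration of an awareness operator built as a weighted sum of localized projection operators, consider again the two-dimensional state space spanned by $|12\rangle$ and $|21\rangle$. The weights `w_B` and `w_C` and the sample amplitudes are hypothetical, and the function below is a sketch of the idea rather than a construction given in the text.

```python
def awareness_measures(state, w_B, w_C):
    """Measures of observations 1 and 2 for A(O_k) = w_B P_k^B + w_C P_k^C.

    `state` is the pair (b12, b21) of amplitudes of |12> and |21>.
    """
    b12, b21 = state
    norm = abs(b12) ** 2 + abs(b21) ** 2
    p12, p21 = abs(b12) ** 2 / norm, abs(b21) ** 2 / norm
    # Observation 1 occurs at B in |12> and at C in |21>; observation 2 the reverse.
    mu_1 = w_B * p12 + w_C * p21
    mu_2 = w_B * p21 + w_C * p12
    return mu_1, mu_2

# Hypothetical amplitudes and weights (the weights sum to one).
mu_1, mu_2 = awareness_measures((0.6, 0.8j), w_B=0.25, w_C=0.75)
print(mu_1, mu_2)
```

Here the measure of observation 1 is a positively weighted mean of its probabilities at the two locations (0.36 and 0.64 for these amplitudes), which is the property no state-independent projection operator can supply. In this basis the operator has eigenvalues `w_B` and `w_C`, so it is not a projection operator unless the weights are 0 or 1.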


16.7 The Boltzmann Brain Problem 


In local quantum field theory with a definite globally hyperbolic spacetime, any pos- 
itive localized operator (such as a localized projection operator) will have a strictly 
positive expectation value in any non-pathological quantum state. Therefore, if such a pos- 
itive localized operator is integrated with uniform weight over a spacetime with infinite 
four-volume, it will give an awareness operator with an infinite expectation value. 

If one takes the integral only up to some finite cutoff time $t$, and normalizes the resulting 
awareness operators, then for a universe that continues forever with a three-volume 
bounded below by a positive value, the integrals will be dominated by times of the same 
order of magnitude as the cutoff time. If at late times the probability per four-volume drops 
very low for ordinary observers, then most of the measure for observations will be con- 
tributed by thermal or vacuum fluctuations, so-called Boltzmann brains. That is, Boltzmann 
brains will dominate the measure for observations. 

If Boltzmann brains dominate the measure for observations, one might ask, “So what?” 
Could it not be that our observations are those of ordinary observers? Or could it not 
be that our observations really are those of Boltzmann brains? However, since Boltzmann 
brain observations are produced mainly by thermal or vacuum fluctuations, it would 
be expected that only a very tiny fraction of their measure would be for observations 
so ordered as our observations. This very tiny fraction, plus the even smaller fraction 
of ordered ordinary observer observations in comparison with the dominant disordered 
Boltzmann brain observations, would be only a very tiny fraction of the measure of all 
observations. Thus the normalized probability of one of our ordered observations (which 
we would use as the likelihood of the theory) would be highly diluted and hence much 
smaller than those of alternative theories in which Boltzmann brains do not dominate. If 
these alternative theories do not have prior probabilities that are too small, they would 
dominate the posterior probabilities. 
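
The dilution argument can be sketched with made-up numbers. Everything below is hypothetical: the relative measures, the fraction of Boltzmann brain observations that happen to be ordered, and the simplifying assumption that all ordinary-observer observations are ordered. The share of the total measure carried by ordered observations is used as a rough stand-in for the likelihood of a theory given one of our ordered observations.

```python
def ordered_fraction(ordinary, boltzmann, ordered_bb_fraction):
    """Share of the total measure carried by ordered observations.

    ordinary: measure of ordinary-observer observations (all assumed ordered).
    boltzmann: measure of Boltzmann brain observations.
    ordered_bb_fraction: fraction of the Boltzmann brain measure that is ordered.
    """
    ordered = ordinary + boltzmann * ordered_bb_fraction
    return ordered / (ordinary + boltzmann)

# Hypothetical theory in which Boltzmann brains dominate the measure.
diluted = ordered_fraction(ordinary=1.0, boltzmann=1e30, ordered_bb_fraction=1e-40)

# Hypothetical alternative theory with no Boltzmann brains.
undiluted = ordered_fraction(ordinary=1.0, boltzmann=0.0, ordered_bb_fraction=0.0)

# The alternative wins unless its prior is lower by more than this factor.
likelihood_ratio = undiluted / diluted
print(diluted, undiluted, likelihood_ratio)
```

With these assumed values the alternative theory would dominate the posterior unless its prior probability were suppressed by roughly thirty orders of magnitude relative to the Boltzmann-brain-dominated theory.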

In summary, Boltzmann brain domination, which is predicted by many simple extensions 
of current theories (e.g. with the awareness operators or their equivalent being obtained by 
a uniform integration over spacetime up to a cutoff that is then taken to infinity), gives a 
reductio ad absurdum for such theories, making their likelihoods very small. Surely there 
are alternative theories that avoid Boltzmann brain domination without such a cost in complexity 
that their prior probabilities would be decreased by as much as the likelihoods gain 
from not having the normalized probabilities of our ordered observations highly diluted by 
disordered Boltzmann brain observations. 

The Boltzmann brain problem is analogous to the ultraviolet catastrophe of late nine- 
teenth century classical physics: Physicists then did not believe that an ideal black body 
in thermal equilibrium would really emit infinite power, and physicists now do not believe 
that Boltzmann brains really dominate observations. 


16.8 Volume Weighting Versus Volume Averaging 


The approach that gives “awareness operators” as uniform integrals over spacetime of 
localized projection operators (or equivalently counts all observation occurrences equally, 
no matter when and where they occur in a spacetime) gives an especially severe Boltzmann 
brain problem in spacetimes with a positive cosmological constant (as ours seems to have) 
with the spatial hypersurfaces having three-volumes that asymptotically grow exponentially, 
as in the $k = 1$ slicing of the de Sitter spacetime. At each time, counting the number 
or measure of observations as growing with the volume is called “volume weighting”. 

In 2008 I proposed the alternative of volume averaging [20, 21], which gives a contri- 
bution to the measure for an observation from a hypersurface that is proportional to the 
spatial density of the occurrences of the observation on the hypersurface. This rewards 
the spatial frequency of observation occurrences rather than the total number that would 
diverge in eternal inflation as the hypersurface volume is taken to infinity. 

Volume averaging ameliorates the Boltzmann brain problem by not giving more weight 
to individual spatial hypersurfaces at very late times when Boltzmann brains might be 
expected to dominate. However, when one integrates over a sequence of hypersurfaces with 
a measure uniform in the element of proper time $dt$, one gets a divergence if the time $t$ goes 
to infinity. One needs some suppression at late times to avoid this divergence. 
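
The contrast between volume weighting and volume averaging can be sketched with a toy sequence of hypersurfaces. The setup is entirely hypothetical: eleven hypersurfaces labeled by $t = 0, \ldots, 10$ in units of the inverse Hubble rate, a three-volume growing as $e^{3t}$, and a constant spatial density of observation occurrences.

```python
import math

def last_slice_share(contributions):
    """Fraction of the summed measure contributed by the latest hypersurface."""
    return contributions[-1] / sum(contributions)

times = range(11)   # hypothetical hypersurfaces at t = 0, 1, ..., 10
density = 1.0       # hypothetical constant density of observation occurrences

# Volume weighting: each hypersurface contributes density times three-volume.
volume_weighted = [density * math.exp(3 * t) for t in times]

# Volume averaging: each hypersurface contributes only the spatial density.
volume_averaged = [density for _ in times]

share_weighted = last_slice_share(volume_weighted)
share_averaged = last_slice_share(volume_averaged)
print(share_weighted, share_averaged)
```

Under volume weighting the latest hypersurface alone carries about 95 per cent of the total in this toy model, whereas under volume averaging it carries the same share as any other hypersurface. The sum over hypersurfaces still grows without bound as more are included, which is the remaining divergence in time.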

In 2010 I proposed Agnesi Weighting [22], replacing $dt$ by $dt/(1 + t^2)$ where $t$ is measured 
in Planck units. This year I have also proposed new measures depending on the 
spacetime average density (SAD) of observation occurrences within a proper time $t$ from 
a big bang or bounce [4]. When these measures are combined with volume averaging and 
a suitable quantum state such as my symmetric-bounce one [23], they appear to be sta- 
tistically consistent with all observations and seem to give much higher likelihoods than 
current measures using the extreme view that the measure is just given by the quantum 
state. 
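
The late-time suppression provided by Agnesi weighting can be checked with a short numerical sketch. The integral of $dt/(1+t^2)$ from 0 to a cutoff $T$ is $\arctan T$, which tends to $\pi/2$, while the uniform measure gives $T$ itself. The midpoint-rule integrator and the sample cutoffs below are choices made for this example only.

```python
import math

def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the composite midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def uniform(t):
    """Weight per unit proper time for a measure uniform in dt."""
    return 1.0

def agnesi(t):
    """Agnesi weight per unit proper time, with t in Planck units."""
    return 1.0 / (1.0 + t * t)

cutoffs = [10.0, 100.0, 1000.0]
uniform_totals = [midpoint_integral(uniform, 0.0, T) for T in cutoffs]
agnesi_totals = [midpoint_integral(agnesi, 0.0, T) for T in cutoffs]
print(uniform_totals, agnesi_totals)
```

The uniform totals grow linearly with the cutoff, whereas the Agnesi totals approach $\pi/2$, so the total weight in time stays finite as the cutoff is taken to infinity.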


16.9 Is there a Better Principle than Mathematical Simplicity? 


Mathematical simplicity seems to be a reasonably good guide in science for choosing prior 
probabilities. However, once observations are taken into account with the likelihoods, the 
highest posterior probabilities do not seem to go to the mathematically absolutely simplest 
theories (such as the theory that nothing concrete exists, which has zero likelihood given 
the fact that we observe something concrete, something not logically necessary). 

Is there another principle that works better for predicting or explaining the properties of 
the actual world? 

A conjectured principle for explaining the properties of the world is the following [24]: 

The actual world is the best possible world. 

By the best possible world, I mean the one with maximum goodness. I take the goodness 
that is maximized to be the pleasantness (measure of pleasure) of conscious or sentient 
experiences, which is what I am calling their intrinsic goodness. Conscious experiences 
of pleasure (happiness, joy, contentment, satisfaction, etc.) would have positive good- 
ness. Conscious experiences of displeasure (unhappiness, pain, agony, discontentment, 
dissatisfaction, etc.) would have negative goodness. 
Our universe seems to have enormous positive goodness, but it also seems to have 
enormous negative goodness. Our universe also seems to have a very high degree of math- 
ematical elegance and beauty. Humans can partially appreciate this, so that helps increase 
goodness. But it would seem that intrinsic goodness consciously experienced by humans 
would be higher if disasters, disease and cruelty were eliminated, even at the cost of less 
mathematical elegance and beauty for the laws of physics. 


16.10 The Optimal Argument for the Existence of God 


If the mathematical elegance of the universe were appreciated by a sentient Being outside 
the universe, that might increase the goodness of the world to a maximum, despite the suf- 
ferings within it. Goodness might be maximized if the Being had all possible knowledge, 
leading to the hypothesis of omniscience for full appreciation. Maximum goodness might 
also suggest the hypothesis that the Being has omnipotence for actualizing the best pos- 
sible world. If the Being actualizes a world of maximum goodness, one might postulate 
that the Being is a Creator and has omnibenevolence. Such an omniscient, omnipotent, 
omnibenevolent Creator would fit the usual criteria for God. 

Thus the assumption that the world has maximum goodness might suggest the conclu- 
sion of the existence of God: 

Without God, it would seem that the goodness of the universe could be increased by 
violations of the laws of physics whenever such violations would lead to more pleasure 
within the universe. However, with God, such violations might decrease God’s pleasure 
so much that total goodness would be decreased. Perhaps the actual world does maximize 
total goodness, despite suffering that is a consequence of elegant laws. 

God may grieve over unpleasant consequences of elegant laws of physics and might 
even directly experience all of them Himself (as symbolized by the terrible suffering He 
experienced in the Crucifixion), but there may be that inevitable trade-off that God takes 
into account in maximizing total goodness. If God really does maximize total goodness, 
He is doing what is best. 

Let me give a draft syllogism for one form of this Optimal Argument for the Existence 
of God: 


Assumptions: 


1. The world is described by the simple hypothesis that it is the best, maximizing the 
pleasure within it. 

2. It is most plausible that either (a) our universe exists in isolation, or (b) our universe 
is created by God whose pleasure is affected by the universe and who has a nature 
determining what gives Him pleasure. 

3. Our universe could have had more pleasure. 

4. If God exists, it is possible that the total pleasure of the world (including both that 
within our universe and within God) is maximized subject to the constraint of His 
nature. 
5. If our universe exists in isolation, 3 implies that it could have had more pleasure and 
hence the world could have been better, contradicting 1. 

6. Therefore, 1 and 3 imply that option (a) of 2 is false. 

7. Then 2 and 4 imply that it is most plausible that God exists and created our universe. 


Of course, the assumptions of this argument are highly speculative, so it certainly does 
not give a proof of the existence of God from universally accepted axioms. It is merely sug- 
gestive, hopefully motivating further investigations of other evidence, such as of historical 
records about the founders and key events in the development of the world religions. 

Assumption 1 is motivated by Occam’s razor but is modified slightly from the typical 
scientific form that one seeks theories that have the simplest mathematical formulation. 
That form seems to work well when one constrains theories by observations, but it does 
not seem to give a fundamental explanation as to why our universe appears to be described 
by relatively simple mathematics but not the simplest possible mathematics. In making 
the assumption of the best world, I assume that goodness is fundamentally given by the 
measure of pleasure (the pleasantness) of conscious experiences (with displeasure counting 
negatively). 

Assumption 2 is highly debatable, since there are many other possibilities that people 
have considered. Unless I made an expansion to include other possibilities, my argument 
might seem worthwhile mainly to those who are trying to choose between these two 
options. 

Assumption 3 does seem rather obviously in agreement with our observations, at least if 
one considers all alternative logical possibilities for our universe. 

Assumption 4 could be true in various different ways. A traditional free-will defense of 
the problem of evil might assume that God gets sufficient pleasure from having persons in 
His universe with libertarian free will (e.g. so that they can love Him freely), so that such a 
world maximizes the total pleasure of God and of His creatures despite the sufferings (dis- 
pleasures, which I am counting as negative pleasures) within the universe. I am personally 
sceptical that it is logically possible for God to create totally from nothing creatures with 
libertarian free will, so I would instead postulate that God gets sufficient pleasure from a 
universe almost always obeying highly orderly and elegant laws of nature that He generally uses in His creation, so that the total pleasure of all conscious experiences (those of 
both God and His creatures) is maximized despite the sufferings that both God and His 
creatures also experience. But one might also consider other alternative ways to flesh out 
how assumption 4 could be true. 

Once one makes these assumptions (and perhaps others that are implicit in my argu- 
ment), then it does seem to me that the plausible existence of God as the Creator of our 
universe follows. Of course, the existence of God does not depend on these assumptions; 
God could exist even if one or more of these assumptions are false, just as I personally 
believe in His existence while being highly sceptical of one or more of the assumptions 
of nearly all of the classical arguments for the existence of God. But I do think that the 
Optimal Argument for the Existence of God gives a somewhat new slant on why it might 
be plausible to believe in God. Furthermore, although one can do a Bayesian analysis for 
theories of our universe itself without reference to God, postulating the existence of God 
might help explain certain features of our best theories of our universe, such as why they 
are enormously more simple than they might have been but yet do not seem nearly so 
simple as what is purely logically possible. 


16.11 Conclusions 


I propose that in a Bayesian analysis in which the probability of a particular observation is 
used as the likelihood of the theory, the sum of the probabilities of all observations should 
be unity. 

It seems plausible that one can find cosmological theories that avoid Boltzmann brain 
domination and explain the high order of our observations, although the measure problem 
is not yet solved. 

The best theories do not seem to be the absolute simplest, so presumably something 
other than simplicity is maximized, such as goodness. The best explanation for the actual 
world may be that it is the best possible world. This assumption, with suitable assumptions 
about God’s nature, such as love of mathematical elegance and of sentient beings, leads to 
the Optimal Argument for the Existence of God. 

The world may maximize goodness only by including a God whose appreciation of 
the elegant universe and sentient beings He created overbalances the sufferings within the 
universe. 
Quantum Origin of Cosmological Structure and 
Dynamical Reduction Theories 


DANIEL SUDARSKY 


17.1 Introduction 


Contemporary cosmology includes inflation as one of its central components. It corre- 
sponds to a period of accelerated expansion thought to have occurred very early in cosmic 
history, which takes the universe from relatively generic post-Planckian era conditions to 
a stage where it is well described (with exponential accuracy in the number of e-folds) 
by a flat Robertson-Walker space-time (which describes a homogeneous and isotropic 
cosmology). 

It was initially proposed to resolve various naturalness problems: flatness, horizons, and 
the excess of massive relic objects, such as topological defects, that were expected to popu- 
late the universe according to grand unified theories (GUT). Nowadays, it is lauded because 
it predicts that the universe’s mean density should be essentially identical to the critical 
value, a very peculiar situation which corresponds to a spatially flat universe. The exist- 
ing data support this prediction. However, its biggest success is claimed to be the natural 
account for the emergence of the seeds of cosmic structure in terms of primordial quantum 
fluctuations, and the correct estimate of the corresponding microwave spectrum. 

This represents, thus, a situation which requires the combined application of general 
relativity (GR) and quantum theory, and that, moreover, is tied with observable imprints left 
from the early universe, in both the radiation in cosmic microwave background (CMB), and 
in the large scale cosmological structure we can observe today. It should not be surprising 
that, when dealing with attempts to apply our theories to the universe as a whole, we should 
come face to face with some of the most profound conceptual problems facing our current 
physical ideas. 

In fact, J. Hartle had noted long ago [28] that serious difficulties must be faced when 
attempting to apply quantum theory to cosmology, although the situations that were envi- 
sioned in those early works corresponded to the even more daunting problem of full 
quantum cosmology, and not the limited use of quantum theory to consider perturbative 
aspects, which is what is required in the treatment of the inflationary origin of the seeds of 
cosmic structure. This led him and his collaborators to consider the Consistent Histories 
framework, which is claimed to offer a version of quantum theory that does not rely on any 
fundamental notion of measurements, as the favored approach to use in such contexts. We 
will have a bit more to say about this shortly. 

As we will see, the treatment of this question forces us to address the so-called measure- 
ment problem of quantum mechanics, in a particularly clean situation where, not only the 
issues appear in a rather simple form, but also where most of the distracting complications 
that have served to elude dealing with the fundamental problems or to convince ourselves 
that we have a satisfactory resolution of them are simply absent. This chapter is based on 
previous works with various colleagues devoted to addressing the many issues that arise 
when confronting these questions. Especially significant in what follows are several articles 
[4, 6, 51]. 


17.2 Observations 


The observational data providing support to our current views on the issue of structure in 
cosmology are the large scale galaxy surveys and the very precise temperature anisotropies 
of the CMB. We will focus here on the second, as they are observed now on the celestial 
two-sphere, which, in turn, are related to the inhomogeneities in the Newtonian potential 
$\Psi$¹ on the last scattering surface (the region from where the photons that reach us now 
were emitted just before the time in the universe’s evolution when the plasma of electrons 
and protons became hydrogen atoms and thus transparent to photons), 


$$\frac{\delta T}{T_0}(\theta, \varphi) = \frac{1}{3}\,\Psi(\eta_D, \vec{x}_D), \tag{17.1}$$


where $\delta T(\theta, \varphi)$ is the difference between the mean temperature $T_0 = 2.725$ K of the CMB 
and that observed at the coordinates $(\theta, \varphi)$ on the celestial sphere, $\eta_D$ is the (conformal) 
time of decoupling (i.e. the transformation of the plasma into hydrogen atoms), and $\vec{x}_D$ are 
the spatial coordinates of the point, along that celestial direction, on the last scattering surface. 

The variations of the Newtonian potential over the last scattering surface, in turn, are 
indicative of the departures from homogeneity and isotropy in the density of matter at that 
moment in the evolution of the universe, and the regions with higher densities represent 
the seeds of structure that will grow as a result of gravitational accretion. 

The data are described in terms of the coefficients $a_{lm}$ of the multipolar series expansion 
of the sky temperature map: 

$$\frac{\delta T}{T_0}(\theta, \varphi) = \sum_{lm} a_{lm}\, Y_{lm}(\theta, \varphi), \tag{17.2}$$


1 The Newtonian potential is, as we shall see shortly, one of the metric perturbations describing the departure of 
the space-time geometry of the universe from a state of perfect homogeneity and isotropy. It receives this name 
because, in the limiting case of a static universe, this perturbation ends up playing the role of the gravitational 
potential in the Newtonian theory of gravity to which general relativity reduces, in the limit of slow particles, 
weak gravity, and sources with pressures that are much smaller than their densities. 
where the coefficients are then given by 


$$a_{lm} = \int \frac{\delta T}{T_0}(\theta, \varphi)\, Y^{*}_{lm}(\theta, \varphi)\, d\Omega. \tag{17.3}$$


Here $Y_{lm}(\theta, \varphi)$ are the spherical harmonics. The different multipole numbers $l$ correspond 
to different angular scales; low $l$ to large scales and high $l$ to small scales. 
It is customary to present these data using the orientation averaged quantities given by 


$$C_l = \frac{1}{2l+1} \sum_{m} |a_{lm}|^2. \tag{17.4}$$
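
The orientation average in Eq. (17.4) is simple enough to spell out in a few lines. The following is a minimal sketch, assuming the 2l + 1 coefficients for a single multipole are already available as a list; the function name and the sample coefficients are hypothetical and not taken from any data set.

```python
import numpy as np


def angular_power(alm_for_l):
    """C_l = (1 / (2l + 1)) * sum_m |a_lm|^2 for one multipole l.

    `alm_for_l` holds the 2l + 1 complex coefficients a_lm, m = -l, ..., l.
    """
    alm_for_l = np.asarray(alm_for_l, dtype=complex)
    n_m = alm_for_l.size
    if n_m % 2 == 0:
        raise ValueError("expected 2l + 1 coefficients (an odd number)")
    return float(np.sum(np.abs(alm_for_l) ** 2) / n_m)


# Hypothetical coefficients for l = 1 (m = -1, 0, +1).
c_1 = angular_power([1 + 0j, 2j, -1 + 0j])
print(c_1)
```

Because only the moduli of the coefficients enter, the result does not depend on the phases of the individual a_lm, which is what makes C_l insensitive to the orientation of the coordinate system on the sky.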


At the theoretical level, the quantities $a_{lm}$ are then given by expanding the Newtonian 
potential in its Fourier components $\Psi_{\vec{k}}$, and using Eq. (17.1) to obtain the expression: 

$$a_{lm} = \frac{4\pi i^{l}}{3} \int \frac{d^3k}{(2\pi)^3}\, j_l(kR_D)\, Y^{*}_{lm}(\hat{k})\, \Delta(k)\, \Psi_{\vec{k}}(\eta_R), \tag{17.5}$$

with $j_l(kR_D)$ the spherical Bessel function of order $l$; $\eta_R$ is the conformal time at the end of 
inflation when the inflaton field is supposed to have decayed into ordinary matter, called 
the time of reheating, and $R_D$ the co-moving radius of the last-scattering surface. 

Between $\eta_R$ and $\eta_D$, the inhomogeneities present at the end of inflation evolved by well-known mechanisms. The resulting change is encoded in the so-called transfer functions 
$\Delta(k)$. 

The data that are obtained by the modern observations culminating with the Planck satellite project, and that correspond to the radiation emitted at time $\eta_D$, are represented in 
Figure 17.1 below. 

The anisotropies in the temperature of the CMB represent the first observable traces of 
the primordial inhomogeneities present at the time of decoupling (the time of the formation 
of the first atoms), and represent the seeds which eventually would evolve into all the 
structure in our Universe: galaxies, stars, planets, etc. 


17.3 Brief Review of the Standard Treatments 


We now offer a very brief review of the methods and ideas used in the standard accounts of 
inflation and the generation of the seeds of cosmic structure from primordial quantum fluc- 
tuations. Much more detailed accounts abound in the literature, and the reader is directed 
to those works, if interested. A highly recommended recent one is Weinberg [56]. 

The starting point of the analysis is a Robertson-Walker (RW) space-time background 
characterized by the space-time metric 


$$dS^2 = a(\eta)^2 \{ -d\eta^2 + d\vec{x}^2 \}, \tag{17.6}$$
expressed in terms of the so-called co-moving spatial coordinates $\vec{x}$ (where a fixed specific 
value of $\vec{x}$ is attached to each co-moving observer) and the conformal time $\eta$, which is related to 
the proper time along the world lines of such observers by $dt = a(\eta)\,d\eta$. The scale factor 
$a$ increases, characterizing the universe’s expansion, and initially does so at an accelerated 
rate, a situation usually described by saying it represents an inflating universe. This accelerated growth is the result of the gravitational dynamics under the influence of the potential 
associated with a scalar field which acts, for a while, like a cosmological constant. This 
scalar field is called the inflaton, and it is characterized by a classical background field 
$\phi = \phi_0(\eta)$, in a so-called slow roll condition so that the scale factor behaves approximately as $a(\eta) \approx -1/(H_I \eta)$, with $H_I$ the nearly constant Hubble rate during inflation (during this period the “conformal time” takes negative values, 
i.e. the inflationary regime corresponds to $\eta \in (-\mathcal{T}, \eta_0)$, $\eta_0 < 0$, where $-\mathcal{T}$ is the time 
when inflation began, and $\eta_0$, very close to 0, is the time inflation ends and reheating 
occurs). 
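
The reason conformal time is negative during inflation can be seen in the idealised case of an exactly constant Hubble rate. This is a sketch under that assumption only: slow roll makes the rate merely approximately constant, and the symbol $H_I$ for it is introduced here for illustration. Taking $a(t) = e^{H_I t}$ and using $dt = a\,d\eta$, with the integration constant fixed so that $\eta \to 0^-$ at late times:

```latex
\begin{aligned}
\eta(t) &= -\int_{t}^{\infty} \frac{dt'}{a(t')}
         = -\int_{t}^{\infty} e^{-H_I t'}\, dt'
         = -\frac{1}{H_I}\, e^{-H_I t}
         = -\frac{1}{H_I\, a(t)}, \\[4pt]
\Longrightarrow \quad a(\eta) &= -\frac{1}{H_I\, \eta},
  \qquad \eta < 0 .
\end{aligned}
```

The whole infinite range of proper time is thus mapped onto negative $\eta$, and the scale factor grows without bound as $\eta$ approaches zero from below.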

On top of this background, one considers quantum fluctuations: that is, fluctuations $\delta\phi$ of 
the inflaton field, and metric fluctuations such as the so-called Newtonian potential $\Psi$, and 
the tensor or gravity wave terms $\delta h_{ij}$ (representing the fluctuation of the spatial part of the 
metric tensor). These are treated quantum mechanically and assumed to be characterized 
by the “vacuum state” $|0\rangle$ (essentially the so-called Bunch-Davies vacuum)². From these 
“fluctuations”, one argues, the primordial inhomogeneities and anisotropies emerge. 

The analysis leads to a remarkable agreement with observations. The actual analysis 
consists of the evaluation of the two-point correlation function of the inflaton field³ in the 
Bunch-Davies vacuum state, $\langle 0|\delta\phi(\vec{x})\,\delta\phi(\vec{y})|0\rangle$, at the end of inflation, and the extraction 
of the power spectrum by writing 

$$\langle 0|\delta\phi(\vec{x})\,\delta\phi(\vec{y})|0\rangle = \int d^3k\, P(k)\, e^{i\vec{k}\cdot(\vec{x}-\vec{y})}. \tag{17.7}$$


The result is an essentially flat spectrum $P(k) \propto k^{-3}$, which is then modified by the 
plasma-physics effects. We note that the characteristic oscillations we see in Figure 17.1 
are the result of these late time plasma physics effects in the matter which originated from 
reheating, which are encoded in the transfer functions $\Delta(k)$, on top of the primordial flat 
spectrum. The physics of these acoustic oscillations is well understood, and will be mostly 
ignored in the rest of this chapter. 
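
The sense in which a spectrum falling as $k^{-3}$ is called "flat" can be made explicit. The following is a sketch assuming the convention in which the two-point function is written as $\int d^3k\, P(k)\, e^{i\vec{k}\cdot(\vec{x}-\vec{y})}$, and writing $P(k) = A\,k^{-3}$ with $A$ a constant amplitude introduced here for illustration:

```latex
\begin{aligned}
d^3k\, P(k) \;\longrightarrow\; 4\pi k^{2}\, dk \; \frac{A}{k^{3}}
  \;=\; 4\pi A\, \frac{dk}{k}
  \;=\; 4\pi A\; d(\ln k) .
\end{aligned}
```

After integration over directions, every logarithmic interval in $k$ contributes the same amount to the fluctuations, so no length scale is singled out. Equivalently, the combination $k^{3} P(k)$ is independent of $k$.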

The remarkable fact is that the theory fits extremely well with the observations, a sit- 
uation that can lead to an understandable and yet dangerous kind of complacency. “The 
theory works, that is it. What else do we want?” However, let us consider the following: according to this account, the Universe was homogeneous and isotropic (both in the 
part that could be described at the “classical level” and in that described at the quantum level) as 

2 This state is the one that would correspond to the standard Minkowski vacuum in the limit of very large $\mathcal{T}$ in 
which the expansion rate would become negligible. 
3 More precisely of a linear combination of $\delta\phi$ and $\Psi$, known as the Sasaki-Mukhanov variable. 
Figure 17.1 The figure shows the observed values of the $C_l$’s together with the theoretical expectation obtained using the best fit of a handful of adjustable cosmological parameters. (Image: Courtesy 
of ESA/Planck Collaboration). 


a result of the early stages of the inflationary process.⁴ However, we end up with a situation 
which is not homogeneous and isotropic, as it includes the primordial inhomogeneities 
which will result in our Universe’s structure and the conditions that permit our own 
existence. 

How does this happen if the dynamics of the closed system does not break those symme- 
tries? If we want to claim we have an explanation for the “origin” of cosmic structure we 
need to understand the corresponding transition, and answer the above question in a fully 
satisfactory way. If our currently accepted theories really do the job, then fine. However, if 
they do not, we can take this as a starting point for further inquiry. 

In fact we can find this issue identified as an open problem in the recent book [56], 
which, on page 476 states: “... the field configurations must become locked into one of an 
ensemble of classical configurations with ensemble averages given by quantum expectation 
values. ... It is not apparent just how this happens ...”. 

A similar issue was considered by N. F. Mott in 1929 concerning the $\alpha$ decay of 
a radioactive nucleus [34]. He seemed to have thought that he resolved it and various 
researchers now seem to share that view. The issue is, what did he actually solve? He considered, within the context of non-relativistic quantum mechanics, the problem of a $J = 0$ 
nucleus placed at ($\vec{x} = 0$) initially in an excited state $|\Psi^{+}\rangle$ which is spherically symmetric, 
and ready to decay into an unexcited nucleus $|\Psi^{0}\rangle$, plus an $\alpha$ particle in state $|\Xi_\alpha\rangle$, which 


4 This is in fact true except for possible remnant features from the pre-inflationary regime which must be 
suppressed by $e^{-N}$, with $N$ the number of e-folds of inflation generally expected to be at least of order $N \sim 70$, 
and which will be ignored from now on. 
is also spherically symmetric. Next, he included in the problem two hydrogen atoms 
with their nuclei fixed at positions $\vec{a}_1$ and $\vec{a}_2$, and their corresponding electrons in the corresponding ground states. The analysis proceeded by computing the degree of alignment 
of the location of the atoms and the initial nucleus (i.e. $\vec{a}_2 = c\,\vec{a}_1$, with $c$ real), if both 
atoms are to be excited by the outgoing $\alpha$ particle. The result is that the probability of both 
atoms getting excited is negligible unless there is a large degree of alignment, a result that 
according to Mott fully explains the fact that the $\alpha$ particle leaves straight line marks in a cloud 
chamber. 

Is this therefore an example of a situation in which an initial state possessing spherical 
symmetry $|\Psi^{+}\rangle$ evolves into a final state lacking such symmetry, despite the assumption 
that the Hamiltonian (governing the decay $|\Psi^{+}\rangle \to |\Psi^{0}\rangle |\Xi_\alpha\rangle$ and the $\alpha$ particle evolution) 
is symmetric under rotations? 

The answer of course is no. As we noted, the initial setting of the problem involves, 
besides the excited nucleus, two unexcited atoms, with localized nuclei, which break the 
rotational symmetry. In fact, the analysis that was carried out by Mott is not based on just 
the Hamiltonian for the evolution of the free $\alpha$ particle, but rather on the Hamiltonian for 
the joint evolution (including the interaction) of the $\alpha$ particle and the two electrons corresponding to the two localized hydrogen atoms. In fact, the projection postulate associated 
with a measurement is also playing a fundamental role in the evaluation of the probabili- 
ties, through the projection of the evolved state on the subspace corresponding to the two 
atoms being excited. In essence, the atoms, therefore, are taken to act as symmetry breaking 
detectors in the problem. Thus, if one replaces these atoms by some hypothetical detectors 
whose quantum description corresponded to spherically symmetric wave functions, each 
one with support, say, on a thin spherical shell with radius $r_i$, and then performs the analogous analysis one would be led to expect a spherical pattern of excitations rather than the 
symmetry breaking straight lines. That is, we would obtain a certain probability for finding 
the two detectors corresponding to the shells $i$ and $j$ in an excited state, and one could 
not have obtained any result that indicated a breakdown of the rotational symmetry. 

The problem we face in the cosmological inflationary situation is thus clearly related 
to the “measurement problem” but in an aggravated form, because detectors, and beings 
capable of creating them, can only exist after the development of the inhomogeneities 
that generate galaxies, stars, and heavy chemical elements. Thus one cannot even rely on 
detectors or observers playing their standard role in the Copenhagen interpretation. 


17.4 A Simplified Model: Mini-Mott 


In order to help us consider conceptual issues, and the possible resolutions in a very clear 
and transparent setting, we find it useful to use a problem where the technicalities are all 
but absent. One should consider this example when contemplating any proposed solution 
to the question of the breaking of the quantum symmetries that one wishes to apply to 
the specific cosmological situation under consideration. In particular, one should use it as 
a test ground when considering the various proposals described in the section “The most 
common answers” below. 
Consider thus a two level detector with states $|-\rangle$ (ground) and $|+\rangle$ (excited), and take two of them 
located at $x = x_1$ and $x = -x_1$. They are both initially in the ground state. Take a free 
particle with initial wave function $\psi(x, 0)$ given by a simple Gaussian centered at $x = 0$ 
(so the whole set up is symmetric with respect to $x \to -x$). 

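The particle’s Hamiltonian is $\hat{H}_P = \hat{p}^2/2M$, while that of each detector is 

$$\hat{H}_i = \epsilon\, \hat{I}_P \otimes \{ |+\rangle^{(i)}\langle +|^{(i)} - |-\rangle^{(i)}\langle -|^{(i)} \}, \tag{17.8}$$

where $i = 1, 2$. The interaction of particle and detector 1 is 

$$\hat{H}_{P1} = \frac{g}{\sqrt{2}}\, \delta(\hat{x} - x_1 \hat{I}_P) \otimes \left( |+\rangle^{(1)}\langle -|^{(1)} + |-\rangle^{(1)}\langle +|^{(1)} \right) \otimes \hat{I}_2, \tag{17.9}$$

with a similar expression for the Hamiltonian term describing the particle’s interaction with 
detector 2. 

Next we consider the evolution of the system as given by Schrödinger’s equation for the 
fully symmetric initial state: 

$$\Psi(0) = \sum_x \psi(x, 0)\, |x\rangle \otimes |-\rangle^{(1)} \otimes |-\rangle^{(2)}, \tag{17.10}$$

and it is clear that after some time $t$ we will have: 

$$\Psi(t) = \sum_x \psi_1(x, t)\, |x\rangle \otimes |+\rangle^{(1)} \otimes |-\rangle^{(2)} + \sum_x \psi_2(x, t)\, |x\rangle \otimes |-\rangle^{(1)} \otimes |+\rangle^{(2)} \tag{17.11}$$

$$\qquad + \sum_x \psi_0(x, t)\, |x\rangle \otimes |-\rangle^{(1)} \otimes |-\rangle^{(2)} + \sum_x \psi_D(x, t)\, |x\rangle \otimes |+\rangle^{(1)} \otimes |+\rangle^{(2)}. \tag{17.12}$$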

One might now interpret the last two terms easily: no detection and double detection 
(involving a bounce, and characterized by a small number O(g)). Also, we could think 
the first two terms indicate that the initial symmetry was broken with high probability: 
either detector 1 was excited or detector 2 was. That is, one might think that by adopting 
a standard interpretation (Copenhagen?) the puzzle is solved. Let us reconsider this. The 
symmetry was just broken, and we cannot pinpoint where? The problem can be seen by 
considering instead the alternative state basis for the detectors (or, as it is often called, an 
alternative “context”): 


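$$|U\rangle = |+\rangle^{(1)} \otimes |+\rangle^{(2)}, \tag{17.13}$$

$$|D\rangle = |-\rangle^{(1)} \otimes |-\rangle^{(2)}, \tag{17.14}$$

$$|S\rangle = \frac{1}{\sqrt{2}} \left[ |+\rangle^{(1)} \otimes |-\rangle^{(2)} + |-\rangle^{(1)} \otimes |+\rangle^{(2)} \right], \tag{17.15}$$

$$|A\rangle = \frac{1}{\sqrt{2}} \left[ |+\rangle^{(1)} \otimes |-\rangle^{(2)} - |-\rangle^{(1)} \otimes |+\rangle^{(2)} \right]. \tag{17.16}$$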


In fact this basis is more convenient for describing issues related to symmetries of the 
problem. It is then easy to see that the $x \to -x$ and $1 \leftrightarrow 2$ symmetry of the initial setting 
and of the dynamics prevents the excitation of an asymmetric term. The issue is then: can 
we or can we not describe things in this basis? And, if not, why not? 

An experimental physicist in the laboratory would perhaps say he/she sees no problem, 
he/she has many things that in practice (for all practical purposes (FAPP) as J. Bell would 
say) indicate one should use the other basis (i.e. one knows that the detectors are always 
either excited or unexcited; one never perceives them in any state of superposition). However, the measurement problem is present here but takes the following form: exactly how 
does our theory account for that experience of our experimental colleague? The fact is that 
most often we just do not care. 

However, if we now have to consider, as in cosmology, a situation where there are no 
experimentalists, and nothing else in the universe, we simply do not know what to do. In 
that situation, why would we believe the conclusions drawn in the first context (or choice 
of basis) but not those of the second? In other words, how do we account for the breakdown 
of the symmetry? 
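
The persistence of the symmetry under unitary evolution can be checked numerically in a lattice caricature of the model above. This is a minimal sketch and not the continuum problem: the particle hops on a finite lattice, the delta-function couplings are replaced by single-site projectors, and every parameter value (lattice size, coupling, detector energy, packet width, evolution time) is a hypothetical choice made only for illustration.

```python
import numpy as np


def mini_mott_amplitudes(n_half=20, x1=5, g=0.3, eps=0.1, hop=1.0,
                         sigma=2.0, t=6.0):
    """Evolve the lattice toy model; return amplitudes of shape (sites, 4).

    Detector columns are ordered |-,->, |-,+>, |+,->, |+,+>
    (first label: detector 1 at +x1, second label: detector 2 at -x1).
    """
    xs = np.arange(-n_half, n_half + 1)
    n = xs.size

    # Free particle on a lattice: nearest-neighbour hopping, even under x -> -x.
    h_particle = -hop * (np.eye(n, k=1) + np.eye(n, k=-1))

    # Single-detector operators in the basis (|->, |+>).
    sz = np.diag([-1.0, 1.0])                 # |+><+| - |-><-|
    sx = np.array([[0.0, 1.0], [1.0, 0.0]])   # |+><-| + |-><+|
    i2 = np.eye(2)

    def site_projector(x):
        p = np.zeros((n, n))
        p[x + n_half, x + n_half] = 1.0       # lattice stand-in for delta(x - x_i)
        return p

    h = (np.kron(h_particle, np.eye(4))
         + eps * np.kron(np.eye(n), np.kron(sz, i2) + np.kron(i2, sz))
         + g * np.kron(site_projector(+x1), np.kron(sx, i2))
         + g * np.kron(site_projector(-x1), np.kron(i2, sx)))

    # Symmetric initial state: Gaussian packet at x = 0, both detectors in |->.
    packet = np.exp(-xs**2 / (4.0 * sigma**2)).astype(complex)
    packet /= np.linalg.norm(packet)
    psi0 = np.kron(packet, np.array([1.0, 0.0, 0.0, 0.0]))

    # Exact unitary evolution exp(-i H t) via the eigen-decomposition of H.
    energies, modes = np.linalg.eigh(h)
    psi_t = modes @ (np.exp(-1j * energies * t) * (modes.conj().T @ psi0))
    return psi_t.reshape(n, 4)


amps = mini_mott_amplitudes()
p_none, p_only2, p_only1, p_both = (np.abs(amps) ** 2).sum(axis=0)

# The evolved state is still invariant under x -> -x combined with 1 <-> 2 ...
still_symmetric = np.allclose(amps[::-1, 2], amps[:, 1])
# ... so "only detector 1 fired" and "only detector 2 fired" are equally likely.
print(still_symmetric, p_only1, p_only2)
```

In this sketch the evolved state contains both single-detection branches with equal weight and remains invariant under the combined reflection and exchange of detectors. Nothing in the unitary evolution itself selects one branch, which is the point at issue in the text.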


17.5 The Most Common Answers 


The question we are considering can be restated as: How is it that the primordial inhomo- 
geneities that act as the seeds of cosmic structure emerge from the quantum fluctuations in 
the inflaton field, if the state of that field is supposed to be a homogeneous and isotropic 
vacuum state, and the field dynamics does not break that symmetry? Here we reproduce 
some of the most common answers one tends to obtain when presenting colleagues with 
these issues, followed by a very brief characterization of what we see as their major short- 
comings. 


1. As in all QM situations, take into account that “we make a measurement”, and that 
breaks the symmetry 

Even ignoring all the standard issues that come with the measurement problem in Quantum 
Theory, we must be aware that taking this view amounts to saying that the conditions that 
made possible our own existence have to be considered as being, at least in part, the result 
of our own actions. This would represent a problematic situation of circular causation, to 
say the least. 


2. Environment-induced decoherence, possibly supplemented by a many worlds inter- 
pretation. 

This approach requires, as a starting point, the identification of a collection of degrees 
of freedom (DoF) of the systems that are taken to act as an “environment” (and then the 
reduced density matrix is obtained by tracing over those DoF). In the standard presentation, such identification entails using our limitations to “measure things”, as part of the 
argument that selects what will be considered as the environment. That, again, involves 
using the peculiarities of the human condition as part of the argument explaining the 
emergence of the conditions that made possible our own existence. One way of characterizing the problem is that we need what is often called a “third person description” of 
the emergence of primordial inhomogeneities. I think most people would object to any 
explanation about the extinction of the dinosaurs that depended, even partially, on the 
argument that relatively large mammals could not possibly survive on a planet dominated 
by such efficient predators. That is, we would not want to explain the emergence of the 
conditions that made possible our own existence (that is the emergence of structure, or the 
extinction of the dinosaurs) based, even in part, on our specific weakness (be it the obser- 
vational limitations of humans here and now to observe certain aspects of the cosmos, 
or the lack of size and strength, or perhaps speed, of mammals to successfully confront 
dinosaurs). 

The fact that successful decoherence makes the reduced density matrix diagonal 
does not tell us that the situation is now described by one element of the diagonal 
density matrix; it is still described by all of them, and as such the situation is still 
symmetric. One needs something like the many worlds interpretation (MWI) to say that 
somehow one is dealing with a multiplicity of realized alternatives and that each one 
corresponds to one element of the diagonal in the reduced density matrix. 

However, MWI (which seems to have other drawbacks) requires some criteria to determine the alternatives, or basis, that characterize the world’s multiplicity. What would play 
that role (i.e. selecting the preferred basis) in the situation at hand, if our limitations cannot 
be used as part of the explanation? 

Even Zurek acknowledges: “The interpretation based on the ideas of decoherence and 
ein-selection has not really been spelled out to date in any detail. I have made a few half-hearted attempts in this direction, but, frankly, I was hoping to postpone this task, since the 
ultimate questions tend to involve such anthropic attributes of the ‘observership’ as perception, awareness, or consciousness, which, at present, cannot be modeled with a desirable 
degree of rigor.” [59]. However, in a recent article [58], Zurek seems to argue in a different direction based on a proposal called “Quantum Darwinism”, which has been severely 
criticized, as shown, for instance, in Kastner [31]. 

In fact, in the cases of symmetric situations, one faces an extra specific problem with
the issue of basis selection. Consider a standard EPR situation, corresponding say to the
entangled two-particle state that results from the decay of a $J = 0$ particle into two identical
particles with spin 1/2 moving in opposite directions along the $z$ axis of our laboratory’s
coordinate system. Let $\{|+;x\rangle^{(1)}, |-;x\rangle^{(1)}\}$ be the basis of eigenstates of the $x$ component
of the spin for particle 1 (and similarly for particle 2). The singlet state that results from
the decay of the system is:


$|\Psi\rangle = \frac{1}{\sqrt{2}}\left\{|+;x\rangle^{(1)} \otimes |-;x\rangle^{(2)} - |-;x\rangle^{(1)} \otimes |+;x\rangle^{(2)}\right\}.$ (17.17)


This entangled state is clearly invariant under rotations about the $z$ axis. Let us say we
consider now that the spin of particle 2 should be taken as an environment. We then evaluate
the reduced density matrix for particle 1 (by tracing over the degrees of freedom of particle 2).
One then finds:


$\rho^{(1)} = \frac{1}{2}\left\{|+;x\rangle^{(1)}\langle +;x|^{(1)} + |-;x\rangle^{(1)}\langle -;x|^{(1)}\right\} = \frac{1}{2}\,I.$ (17.18)


This matrix is, however, diagonal in all possible bases for system 1. Thus, environmental 
decoherence does not offer a criterion for choosing the basis. 
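
The statement can be checked numerically. The following is a minimal sketch in plain Python (the function names and the product state used for contrast are illustrative choices, not part of the original analysis): it traces out particle 2 from the singlet of Eq. (17.17), re-expresses the result in spin bases along several directions, and compares the off-diagonal element with that obtained from a non-symmetric product state.

```python
import cmath
import math

def reduced_density_matrix(c):
    """Trace out particle 2 from a two-spin pure state with amplitudes c[i][j].

    Index i labels particle 1 and j labels particle 2 (0 is '+', 1 is '-').
    """
    return [[sum(c[i][j] * complex(c[k][j]).conjugate() for j in range(2))
             for k in range(2)] for i in range(2)]

def in_rotated_basis(rho, theta, phi):
    """Matrix elements of rho in the spin basis along the direction (theta, phi)."""
    ch, sh = math.cos(theta / 2), math.sin(theta / 2)
    # Columns of u are the rotated basis vectors |+n> and |-n>.
    u = [[ch, -cmath.exp(-1j * phi) * sh],
         [cmath.exp(1j * phi) * sh, ch]]
    return [[sum(complex(u[i][a]).conjugate() * rho[i][k] * u[k][b]
                 for i in range(2) for k in range(2))
             for b in range(2)] for a in range(2)]

s = 1 / math.sqrt(2)
singlet = [[0.0, s], [-s, 0.0]]      # (|+>|-> - |->|+>)/sqrt(2), as in Eq. (17.17)
product = [[0.0, 1.0], [0.0, 0.0]]   # |+>|->, a non-symmetric state used for contrast

rho_singlet = reduced_density_matrix(singlet)
rho_product = reduced_density_matrix(product)

for theta, phi in [(0.0, 0.0), (math.pi / 2, 0.0), (1.0, 2.0)]:
    off_s = abs(in_rotated_basis(rho_singlet, theta, phi)[0][1])
    off_p = abs(in_rotated_basis(rho_product, theta, phi)[0][1])
    print(f"theta={theta:.2f} phi={phi:.2f}  singlet={off_s:.3f}  product={off_p:.3f}")
```

For the singlet the off-diagonal element vanishes in every basis tried, so diagonality singles out no basis; for the product state it vanishes only in the original basis.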

There is in fact a simple theorem, proved first in [6], which shows that this problem is 
generic: 


Theorem: Consider a quantum system made of a subsystem $S$ and an environment $E$, with
corresponding Hilbert spaces $\mathcal{H}_S$ and $\mathcal{H}_E$, so that the complete system is described by states
in the product Hilbert space $\mathcal{H}_S \otimes \mathcal{H}_E$. Let $G$ be a symmetry group acting on the Hilbert
space of the full system in a way that does not mix the system and environment. That is,
the unitary representation $O$ of $G$ on $\mathcal{H}_S \otimes \mathcal{H}_E$ is such that $\forall g \in G$, $O(g) = O^S(g) \otimes O^E(g)$,
where $O^S(g)$ and $O^E(g)$ act on $\mathcal{H}_S$ and $\mathcal{H}_E$, respectively. Let the system be characterized
by a density matrix $\rho$ which is invariant under $G$. Then the reduced density matrix of the
subsystem is a multiple of the identity in each invariant subspace of $\mathcal{H}_S$.


This result shows that in these cases decoherence is not even helpful in selecting a pre- 
ferred (or pointer) basis. 


3. Consistent (or de-cohering) histories 

We believe the consistent histories approach has some serious problems in general, but in 
any event, in the particular case at hand the answer we obtain depends on the questions 
we ask. In particular, we can use the approach to conclude that, with probability 1, our 
universe today is homogeneous and isotropic (for details see [54]). 


4. This is “just philosophy”

It seems clear that in this conference, in contrast with various others I have attended, I 
would not need to argue that conceptual clarity, as provided by philosophical consider- 
ations, is not something to be dismissed in the pursuit of a deeper understanding of the 
physical world. 


17.6 Collapse Approach 


We have argued above that the existing explanations for the generation of the primordial
cosmic inhomogeneities as resulting from quantum fluctuations of a field in a completely
homogeneous and isotropic state cannot be considered fully satisfactory. On the other
hand, it seems clear that, if one overlooks that issue, the inflationary account leads to a
remarkable agreement with observations. It is thus hard to dismiss the proposal as a whole.
We then take the conservative approach, which is to argue that the overall scheme is correct
but that there are some missing elements. As the issue seems to be connected with the measurement
problem in general, we feel it is natural to consider a modification of quantum theory which
in principle seems capable of addressing the point. To many people any suggestion of
modifying quantum mechanics seems something not far short of heresy. Of course the point
is that it is perhaps the best tested theory in the history of physics, with an extraordinary
number of applications in a large range of systems. How can we seriously consider such
an adventurous proposal?

Well, to start with, we must recognize that the situation we face here is rather unique in 
that it involves i) quantum theory, ii) gravity and iii) actual observational data (as pointed 
out to me by J. Martin), and thus it would not be very surprising if some aspects of physics 
which have not been previously observed might show up within this context for the first 
time. 

The issue is that we need to be able to point to some physical process, occurring in time,
when trying to explain the emergence of the primordial inhomogeneities and anisotropies,
that is, the seeds of cosmic structure. After all, emergence means precisely that something
that was not there at one time is there at a later time. We need to explain the breakdown of the
symmetry of the initial state, when the dynamics that results from the action of the scalar
field in interaction with the space-time metric has no feature capable of doing so, just as
in the mini-Mott example. The basis of our approach is thus the observation that theories
incorporating something like spontaneous collapse of quantum states can do this.


Collapse theories: There is an important body of previous work on dynamical collapse
theories starting with the early proposals by Pearle, Ghirardi, Rimini and Weber, and a long
list of further developments [21–27, 38, 40–44, 46] including the ideas linking the issue
with aspects of quantum gravity by Diósi and Penrose [9–18, 47–50]. There are also some
recent advances to make it compatible with special relativity [2, 19, 45, 55]. A recent work
by Weinberg indicates that the issues underlying such a proposal are starting to resonate
with the larger physics community [57].

We propose to address the issue at hand by adding, to the standard inflationary 
paradigm, a quantum collapse of the wave function considered as a self induced pro- 
cess, such as that envisaged in those proposals. Here we will illustrate these ideas 
with one rather well developed proposal formulated in the context of ordinary quan- 
tum mechanics, known as CSL [42] in an “adaptation” to the situation at hand. This 
analysis was carried out in [4], while almost simultaneous treatments based on dif- 
ferent adaptations of CSL to the problem were carried out in [32] and [7]. We 
must of course acknowledge that a truly viable resolution of the question at hand 
would need to be formulated within a fully generally covariant version of a collapse 
theory. 

Let us first briefly consider how such a modification of quantum theory, involving
dynamical collapses, would fit with our current views and physical ideas. This is a very
delicate question and its careful exploration would no doubt require much more than what
has been achieved so far. However, to see the possibilities let us recall that, in connection
with quantum gravity, there are some serious issues and conceptual difficulties that are still
outstanding, and that it would not be so surprising if their resolution involved modifications
that would allow, or even require, the incorporation of collapse-like effects. The
two major issues in this regard are:


1. The problem of time: All canonical approaches to quantum gravity, including the
Wheeler–DeWitt proposals and the loop quantum gravity program, lead to a timeless
theory. The issue is closely tied to the diffeomorphism invariance of GR, which translates
into the requirement that at the quantum level different slices of space-time be
regarded as gauge equivalent.

2. Recovery of space-time: More generally, we do not know how to recover space-time 
from canonical approaches to quantum gravity. 


Solutions to 1 usually start by using some dynamical variable as a physical clock and
then considering relative probabilities (and wave functions). It seems that in those
considerations one recovers only an approximate Schrödinger equation with corrections that
violate unitarity [20]. Could something like this lie at the bottom of collapse theories?
Regarding 2, we note that there are many suggestions indicating space-time might be an 
emergent phenomenon (see for instance [3], [30], and [52]). 

How should we think about general relativity and its emergence from a deeper, quantum 
gravity theory? What is the nature of what in GR we describe as a classical space-time? 
These are very hard questions and I would not attempt anything close to a solid answer, 
in part because in any attempt to do so with some rigor one would need to have at hand 
a fully satisfactory and workable theory of quantum gravity, something we do not have at 
this time. However, it is often useful in physics to consider analogies. In this case I want to 
illustrate the nature of what we might be facing with the questions above, using a hydro- 
dynamic analogy: A fluid is often described in terms of the Navier-Stokes (NS) equations 
characterizing the velocity field of the liquid elements. The fluids are also characterized by 
the densities and pressures satisfying some equation of state. We know, however, that such 
macroscopic characterization of the fluid, in which it is described by a simple and well 
formulated differential equation, is only an approximation, and that, at a more fundamen- 
tal level, the fluid is made of individual molecules (which are themselves made of atoms, 
and so on). We know then that notions such as fluid density or fluid-element’s velocity are 
emergent characterizations (whose regime of validity is limited), and that the NS equation 
is not only an approximation, but that it might be grossly violated in suitable circumstances 
(we can think of water passing through a very hot region and having part of it boil, or the 
breaking of waves in the sea). 

It might well be, therefore, that the metric characterization of space-time emerges from 
a deeper quantum gravity regime in the same manner as the fluid description emerges 
from the atomic or molecular level characterization of its components. In that case the 
equations governing the macrodynamics of space-time, that is Einstein’s equations, would 
be approximate characterizations of the bulk dynamics, but not a truly fundamental law of 
nature, just like the NS equations. Moreover, concepts like space and time would simply fail
to exist at the fundamental level, just as fluid velocity or local density do not exist at the
subatomic level, which we know characterizes the fundamental nature of the fluid more
accurately than the hydrodynamical level.

In that case, the level at which one can talk about space-time concepts is the classical 
description. However, some traces of the quantum aspects might appear as deviations from 
the simpler effective level characterization, just as the forming of the foam at the breaking 
of the sea waves reflects the underlying molecular constitution of the fluid. Those traces 
might, according to the ideas we want to study, include behaviors that would look like a 
quantum mechanical collapse of the wave function (which after all is based on a classical 
characterization of space-time, i.e. a single particle wave function depends on the space 
and time coordinates, and so does the operator describing a quantum field in relativistic 
quantum field theory). 


17.6.1 General Setting 


We now turn to describe the setting within which we will address the issue of the emergence
of the primordial cosmological inhomogeneities from quantum fluctuations during
inflation. The general idea, as we discussed above, is that at the quantum level space-time,
and thus gravity, are very different from their classical characterization, that at “large
scales” those quantum aspects become manifest as small traces of its true nature, and that
such traces include effects that look like a collapse of the quantum wave function of the
matter fields.

Below we discuss a formal implementation of these ideas in the context of the relatively 
simpler schemes involving discrete collapses. 

We take then the view that the inflationary regime is one where gravity already has a 
good classical description, and that Einstein’s equation is generally satisfied (i.e. not always 
and not in general in an exact manner). However, the situation is such that matter fields still 
require a full quantum treatment. The setting will thus naturally be semi-classical Einstein’s 
gravity (more precisely, we will rely on the notion of semi-classical self-consistent config- 
urations (SSC) developed in [8]). 


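Definition: The set $\{g_{\mu\nu}(x), \hat\phi(x), \hat\pi(x), \mathcal{H}, |\xi\rangle \in \mathcal{H}\}$ represents a SSC if and only if $\hat\phi(x)$,
$\hat\pi(x)$ and $\mathcal{H}$ correspond to a quantum field theory constructed over a space-time with metric
$g_{\mu\nu}(x)$ and the state $|\xi\rangle$ in $\mathcal{H}$ is such that


$G_{\mu\nu}[g(x)] = 8\pi G\,\langle\xi|\hat T_{\mu\nu}[g(x), \hat\phi(x), \hat\pi(x)]|\xi\rangle.$ (17.19)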


This corresponds, in a sense, to the GR version of the Schrödinger–Newton equation [9].
To this setting we want to add, in order to describe the transition from a homogeneous
and isotropic situation to one that is not, an extra element: THE COLLAPSE OF THE
WAVE FUNCTION. That is, besides the unitary evolution describing the change in time of
the state of a quantum field, there will be, sometimes, spontaneous jumps. For instance, the
vacuum state of the quantum inflaton field, seen as the product of the individual harmonic
oscillator states for each mode, could undergo a spontaneous transition to a state where one
of such oscillators becomes excited:


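$\ldots |0\rangle_{\vec k_1} \otimes |0\rangle_{\vec k_2} \otimes |0\rangle_{\vec k_3} \otimes \ldots \;\to\; \ldots |\Xi\rangle_{\vec k_1} \otimes |0\rangle_{\vec k_2} \otimes |0\rangle_{\vec k_3} \otimes \ldots$ (17.20)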


We must note here that semi-classical GR was deemed to be unviable in the work [37]
which, however, only considered the possibility of regarding the theory, and in particular
the equation, as holding always in a 100 per cent precise manner, and thus precluded the
consideration of spontaneous collapse of the wave function. The situation differs dramatically
once we incorporate the possibility of self-induced collapse of the wave function.
A careful analysis of these issues can be found in [5], in which it was shown that the
conclusions about the unviability of semi-classical GR were rather premature.

Again, the view is that there is an underlying Quantum Theory of Gravity (probably with
no notion of space or time). However, by the “time” we can recover space-time concepts,
the semi-classical treatment is a very good one, and its regime of validity includes the
inflationary regime as long as the curvature is not too large (i.e. $R \ll 1/\ell_{\rm Planck}^2$, a condition
that is supposed to hold in the inflationary regime). Moreover, just as under some conditions
we find some departures from the exact Navier–Stokes equation in the behavior of a fluid
(breaking sea waves), under some conditions there are quantum collapses that might occur
in association with some departures from Einstein’s equation.

Now, we turn again to the SSC formalism and note that in order to describe the collapse 
we must go beyond the transition Eq. (17.20). That is, a collapse will be a transition from 
one SSC to another, not simply from one quantum state to another. The point is that a 
change in the quantum state, generically, implies a change in the expectation of the energy 
momentum tensor, and then, according to Eq. (17.19), a change in the space-time and 
therefore in the Hilbert space where the state “lives”. Clearly the complete characterization
of such a process becomes rather complex. An individual collapse is then associated
with the “gluing”⁵ of two space-times along the collapse hypersurface, while imposing
certain continuity requirements. It is clear that Einstein’s equations would not hold exactly
at such junctures, and we take this as an indication that the quantum aspects of gravitation
that have been ignored in arriving at the classical characterization of space-time must be
playing some role in the collapse process, just as the molecular dynamics of the fluid
constituents play a role in the breaking of sea waves and the breakdown of the NS equation in
association, say, with the generation of foam during such a process. The formalism is, in
a sense, similar to the so-called Israel matching prescription developed to describe thin
shells in general relativity [29]. For a detailed analysis of these issues see [8].

The discussion above refers to the discrete versions of spontaneous collapse theories,
such as those embodied in the GRW proposal, and one would have to modify the treatment
in an appropriate manner, using adequate limiting procedures and integral representations
of the dynamics, when dealing with continuous versions such as the CSL theory.


5 This is how one refers to the process of producing a new space-time by joining two space-times with boundary,
along the corresponding boundaries. This might be done provided that the induced metrics on the corresponding
boundaries of both manifolds are “the same” (i.e. there is an isometry between the boundaries).


17.6.2 Practical Treatment of the Problem 


In principle one would use the SSC formalism in treating the collapse of each individual
mode of the inflaton field. This would be extremely cumbersome and impractical. Furthermore,
it is clear that things would become even more complex in the context of continuous
collapse dynamics. We will thus use a simpler scheme that uses a single quantum field
theory (QFT) construction rather than the multiplicity of constructions that are required
when strictly following the SSC prescription. We have checked that this is equivalent, at the
lowest order in perturbation theory, to the one based on SSC [8].

We again split the treatment into that of a classical homogeneous (“background”) part
and an inhomogeneous part (“fluctuation”). That is, we write the physical metric as
$g_{ab} = g^{(0)}_{ab} + \delta g_{ab}$, and similarly we write the scalar field as $\phi = \phi_0 + \delta\phi$. In both cases the first term
represents the homogeneous and isotropic background, and the second the perturbation
containing the deviation from such a symmetric state of affairs.⁶ The background is taken
again to be a flat Friedmann–Robertson universe, and the homogeneous scalar field $\phi_0(\eta)$.
In the strict SSC treatment this corresponds to the zero mode of the quantum field (i.e. the
mode of the quantum field which has no dependence on the spatial coordinates).

The main difference in treatment between our approach and the standard ones will con- 
cern the treatment of the spatially dependent perturbations, which will be subject not only 
to a quantum treatment but also to a collapse dynamics. Furthermore, according to the 
ideas previously discussed, in our approach we quantize the scalar field but not the metric 
perturbations. 


17.6.3 Continuous Spontaneous Localization 


Here, we offer a very brief description of this theory, and refer the reader to Pearle [39] for a
more detailed presentation. The simplest version of the theory is defined by two equations.
The first is a modified Schrödinger equation, whose solution is:


$|\psi, t\rangle = \hat T\, e^{-\int_0^t dt'\left[i\hat H + \frac{1}{4\lambda}\left[w(t') - 2\lambda\hat A\right]^2\right]}\,|\psi, 0\rangle,$ (17.21)


where $\hat H$ is the usual Hamiltonian, $\hat T$ is the time-ordering operator, $\lambda$ is a collapse rate
parameter, and $\hat A$ is an operator to whose eigenstates the collapse tends. $w(t)$ is a random
classical function of time, of white noise type, and the probability of $w(t_i)$ taking
on a particular value in an interval $dw(t_i)$, over the range $0 < t_i < t$, is given by the second
equation, the probability rule:


$P\,Dw(t) \equiv \langle\psi, t|\psi, t\rangle \prod_{t_i=0}^{t} \frac{dw(t_i)}{\sqrt{2\pi\lambda/dt}}.$ (17.22)


6 This sort of split into background and perturbation is not trivial due to the “gauge freedom” intrinsic
to diffeomorphism invariant theories such as GR, but we will mostly ignore such complications in this
presentation.


The deterministic and unitary evolution, as described by the standard Schrödinger
equation (i.e. the U, for unitary, process in Penrose’s language), and the
non-deterministic “jump” or collapse of the quantum state that in the textbook approach is
associated with a measurement of the observable $\hat A$ (i.e. the R, for reduction, process in
Penrose’s language) are thereby unified. For non-relativistic QM, the regime for
which the collapse theories were originally developed as a means of resolving the
measurement problem, the proposal involves collapse to the joint eigenstates of a collection of
commuting operators (essentially the mass density operators at all points of space) rather
than the simpler collapse to the eigenstates of a single operator $\hat A$, as described above. This
entails a classical field, a random function of time at each point of space, rather than the
simpler single random function of time $w(t)$, as described above. In fact, one needs to use
smeared operators, since otherwise the localization that results from the collapse would
involve the generation of infinite energy. See [1] for a discussion.
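
To see how Eqs. (17.21) and (17.22) together reproduce the Born rule, it may help to look at the simplest special case, which is an assumption made here for illustration only: $\hat H = 0$ and a single operator $\hat A$ with eigenvalues $a_n$. The time-ordered exponential is then diagonal in the eigenbasis of $\hat A$, the state depends on the noise only through $B = \int_0^t w\,dt'$, the populations are proportional to $|c_n|^2 \exp[-(B - 2\lambda a_n t)^2/(2\lambda t)]$, and the probability rule makes $B$ a mixture of Gaussians with means $2\lambda a_n t$, variance $\lambda t$ and weights $|c_n|^2$. A minimal Monte Carlo sketch of that case (function and parameter names are hypothetical):

```python
import math
import random

def csl_outcome_statistics(born_probs, eigenvalues, lam, t, n_runs, seed=0):
    """Monte Carlo for CSL with H = 0 and a single collapse operator A.

    born_probs[n] = |c_n|^2 of the initial state in the eigenbasis of A
    (all entries must be positive). Returns (frequencies, mean_dominant_weight).
    """
    rng = random.Random(seed)
    indices = range(len(born_probs))
    counts = [0] * len(born_probs)
    dominant_sum = 0.0
    for _ in range(n_runs):
        # Sample B = integral of w(t') dt' from the probability rule (17.22):
        # a mixture of Gaussians with means 2*lam*a_n*t and variance lam*t.
        n = rng.choices(indices, weights=born_probs)[0]
        b = rng.gauss(2 * lam * eigenvalues[n] * t, math.sqrt(lam * t))
        # Normalized populations of the state (17.21) for this noise history.
        log_w = [math.log(p) - (b - 2 * lam * a * t) ** 2 / (2 * lam * t)
                 for p, a in zip(born_probs, eigenvalues)]
        top = max(log_w)
        w = [math.exp(x - top) for x in log_w]
        total = sum(w)
        w = [x / total for x in w]
        winner = max(indices, key=lambda m: w[m])
        counts[winner] += 1
        dominant_sum += w[winner]
    return [c / n_runs for c in counts], dominant_sum / n_runs

freqs, dominant = csl_outcome_statistics([0.3, 0.7], [-1.0, 1.0],
                                         lam=1.0, t=50.0, n_runs=4000)
print("collapse frequencies:", freqs)
print("mean weight of the dominant eigenstate:", dominant)
```

For $\lambda t \gg 1$ each run ends essentially in a single eigenstate of $\hat A$, and the fraction of runs ending in eigenstate $n$ approaches $|c_n|^2$; for small $\lambda t$ the dominant weight stays well below one, which is the sense in which $\lambda$ sets the collapse rate.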

It is clear that the parameter $\lambda$ must be small enough not to conflict with high precision
tests of quantum theory in the domain of subatomic physics and big enough to result in
rapid localization of “macroscopic objects”. GRW suggested a value of order
$\lambda \sim 10^{-16}\,\mathrm{s}^{-1}$ (it probably depends on the particle’s mass).

The point is that collapse theories (such as CSL) can account for the breakdown of 
symmetries in the mini-Mott example, and in the cosmological setting. In particular, the 
original version of CSL, with the smeared position operator for individual particles as the 
universal collapse generating operator, clearly indicates that the first basis is the appropriate 
one to treat the mini-Mott problem, because the detectors are built from particles and their 
localization will lead to states in which the individual detectors will give a well defined 
level of excitation. 

We will thus consider a version of CSL appropriate for the situation at hand. That 
involves considering a version of the theory adapted to quantum field theory, and which in 
principle, given the general relativistic setting of the problem, should also be fully covari- 
ant. We will ignore in this work this second requirement, in the clear understanding that a 
truly satisfactory resolution will need to address that aspect as well. 

As we have already indicated, the space-time metric will be treated classically. We will
be using a specific gauge (the so-called Newtonian gauge) and we will be ignoring tensor
perturbations. The perturbed space-time will therefore be described by the metric:


$ds^2 = a^2(\eta)\left[-(1+2\Psi)\,d\eta^2 + (1-2\Psi)\,\delta_{ij}\,dx^i dx^j\right], \qquad \Psi(\eta,\vec x) \ll 1,$ (17.23)


where $\Psi$ is just the so-called Newtonian potential introduced in Eq. (17.1). We will set
$a = 1$ at the “present cosmological time”, and assume that the inflationary regime
takes place during the interval $-\mathcal{T} < \eta < \eta_0$, with $\eta_0$ negative and very small in
absolute terms, and with the scale factor $a(\eta)$ given below Eq. (17.6). The inflaton scalar
field $\hat\phi(x)$ must be treated using QFT in curved space-time (using SSC). The quantum state
of the scalar field and the space-time metric satisfy Einstein’s semi-classical equation:
$G_{\mu\nu} = 8\pi G\,\langle\xi|\hat T_{\mu\nu}|\xi\rangle$.

As we explained, we will be concentrating on the modes other than the zero mode, which
we treat classically as an effective approximation. At the early stages of inflation, which
we denote by $\eta = -\mathcal{T}$, the state of the scalar field perturbation is described by the
Bunch–Davies vacuum, and the space-time is 100 per cent homogeneous and isotropic. In fact, in
the vacuum state, the operators $\hat{\delta\phi}_k$ and $\hat\pi_k$ are characterized by Gaussian wave functions
centered on 0 with uncertainties $\Delta\delta\phi_k$ and $\Delta\pi_k$ respectively, when expressed in the
eigenbasis of either $\hat{\delta\phi}_k$ or $\hat\pi_k$.

The role of the collapse dynamics is to modify the quantum state, and generically it will
lead to modified expectation values of $\hat{\delta\phi}_k(\eta)$ and $\hat\pi_k(\eta)$. In order to proceed, we must now
specify in some detail the rules governing the collapse. As we have indicated, this is supposed
to be the result of some unknown aspect of physics, which we will here encode into
an adapted version of CSL theory. How can we deal with such a situation? Our approach
is based on making an “educated guess”, which can later be contrasted with observations.
The collapse is assumed to take place for each mode independently according to
the CSL dynamics and determined by independent stochastic functions. We must stress
that in this approach our universe is viewed as corresponding to one specific realization of
these stochastic functions (one for each $\vec k$).

The semi-classical Einstein equation we must focus on is (the $\eta$–$\eta$ component of
$G_{\mu\nu} = 8\pi G\,\langle\xi|\hat T_{\mu\nu}|\xi\rangle$ at first order in the perturbations):


$-k^2\,\Psi(\eta,\vec k) = 4\pi G\,\phi_0'(\eta)\,\langle\hat{\delta\phi}'(\vec k,\eta)\rangle = \frac{4\pi G\,\phi_0'(\eta)}{a}\,\langle\hat\pi(\vec k,\eta)\rangle,$ (17.24)


where $\Psi(\eta,\vec k)$ is the Fourier transform of the Newtonian potential, the prime represents
the derivative with respect to conformal time $\eta$, and the expectation values are taken in
the quantum state of the matter field (in this case the inflaton field) that results from the CSL
dynamics on the corresponding hypersurface.⁷

7 In the actual calculation, we found it convenient to work in the Schrödinger picture, where the inflaton field is
time-independent and the state vector $|\psi, \eta\rangle$ undergoes evolution in conformal time under the joint influence
of the Hamiltonian and collapse dynamics. In our treatment, it is most convenient to work with the individual
modes of the inflaton field, that is, the Fourier components of $\hat\phi(x)$, $\hat\pi(x)$. These components turn out to satisfy
essentially the position-momentum commutation relations, and their Hamiltonian turns out to be of a modified
harmonic oscillator form. However, we proceed with the discussion in the interaction picture, where the state
changes are due to the CSL dynamics while the field operators evolve with the free Hamiltonian.

As we said, at the start of inflation ($\eta = -\mathcal{T}$) the state is described by the Bunch–Davies
vacuum, so the expectation value $\langle\hat\pi(\vec k,\eta)\rangle$ vanishes, and thus, as long as the state of the field is
that vacuum, the space-time will be 100 per cent homogeneous and isotropic. Of course the
collapse dynamics will modify this unacceptable conclusion. The quantity of direct interest
to us is:


$\frac{\Delta T(\theta,\varphi)}{\bar T} = c\int d^3k\; e^{i\vec k\cdot\vec x}\,\frac{1}{k^2}\,\langle\hat\pi(\vec k,\eta)\rangle,$ (17.25)


where $c \equiv -\frac{4\pi G\,\phi_0'(\eta)}{3a}$. Here, $\vec x$ is a point on the intersection of our past light cone with
the last scattering surface ($\eta = \eta_D$), and corresponds to the direction on the sky specified
by $\theta, \varphi$. Thus, as follows from Eq. (17.3), the expression for the quantity of direct
observational interest is:


$\alpha_{lm} = c\int d^2\Omega\; Y^*_{lm}(\theta,\varphi)\int d^3k\; e^{i\vec k\cdot\vec x}\,\frac{1}{k^2}\,\langle\hat\pi(\vec k,\eta_D)\rangle.$ (17.26)

There is nothing analogous to this expression in the standard approaches! The expression
above shows that the quantity of interest can be thought of as the result of a “random
walk” on the complex plane. That is, the quantity of interest is the sum (represented by the
second integral above) of many contributions (in this case infinitely many), one for each value of $\vec k$,
where every individual contribution involves some random variable (the one determining
the post-collapse value of $\langle\hat\pi(\vec k,\eta)\rangle$). As is always the case, one cannot predict the end point
of such a random “walk”; however, one can focus on the most likely value of the magnitude
of the total displacement. It follows from Eq. (17.26) that the quantity corresponding to
the magnitude of the total displacement for one realization of the collapse dynamics (the one
corresponding to that which took place in our universe) is:


$|\alpha_{lm}|^2 = (4\pi c)^2\int d^3k\, d^3k'\; j_l(kR_D)\, j_l(k'R_D)\, Y_{lm}(\hat k)\, Y^*_{lm}(\hat k')\,\frac{1}{k^2}\,\frac{1}{k'^2}\,\langle\hat\pi(\vec k,\eta)\rangle\,\langle\hat\pi(\vec k',\eta)\rangle^*.$ (17.27)

Let us recall that we need the product of the expectation values and not the expectation
value of the product!⁸ Now we proceed to estimate the most likely value of the quantity
above, through the use of an imaginary ensemble of realizations and identifying (for the
sake of computational ease) the most likely value with the ensemble average. We need to
compute the ensemble average at “late times”. One can show that generically we will have
$\overline{\langle\hat\pi(\vec k,\eta)\rangle\,\langle\hat\pi(\vec k',\eta)\rangle^*} = f(k)\,\delta(\vec k - \vec k')$. Therefore, from Eq. (17.27), we obtain


$\overline{|\alpha_{lm}|^2} = (4\pi c)^2\int_0^\infty dk\; j_l(kR_D)^2\,\frac{1}{k^2}\,f(k).$ (17.28)


At this point it should be noted that in the present analysis the essentially flat spectrum, 
which is known to fit well with observations, would correspond to f(k) being proportional 
to k. 
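
The reason $f(k) \propto k$ corresponds to a flat spectrum can be made explicit. With $f(k) = k$ the integral in Eq. (17.28) becomes $\int_0^\infty dk\, j_l(kR_D)^2/k$, which after the substitution $x = kR_D$ no longer depends on $R_D$ and equals the standard result $1/[2l(l+1)]$, so that $l(l+1)\,\overline{|\alpha_{lm}|^2}$ is independent of $l$. The following is a rough numerical check in plain Python, offered as a sketch: the function names, the cut-off `x_max`, and the analytic tail estimate are choices made here, and the check is restricted to $l \ge 1$.

```python
import math

def spherical_jl(l, x):
    """Spherical Bessel function j_l(x) for small integer l and x >= 0."""
    if x < 1.0:
        # Power series: x^l/(2l+1)!! * sum_k (-x^2/2)^k / (k! (2l+3)(2l+5)...(2l+2k+1))
        term = x ** l
        for m in range(1, l + 1):
            term /= 2 * m + 1
        total = 0.0
        for k in range(12):
            total += term
            term *= -x * x / (2 * (k + 1) * (2 * l + 2 * k + 3))
        return total
    # Upward recurrence from j_0 and j_1 (adequate for small l when x >= 1).
    j_prev = math.sin(x) / x
    if l == 0:
        return j_prev
    j_curr = math.sin(x) / x ** 2 - math.cos(x) / x
    for m in range(1, l):
        j_prev, j_curr = j_curr, (2 * m + 1) / x * j_curr - j_prev
    return j_curr

def flat_spectrum_integral(l, r_d, x_max=400.0, steps=40000):
    """Trapezoid estimate of int_0^inf dk j_l(k r_d)^2 (1/k^2) f(k) with f(k) = k, l >= 1."""
    h = x_max / steps / r_d              # step in k; the integrand vanishes at k = 0
    total = 0.0
    for i in range(1, steps + 1):
        k = i * h
        weight = 0.5 if i == steps else 1.0
        total += weight * spherical_jl(l, k * r_d) ** 2 / k
    # Tail beyond x_max: j_l(x)^2 averages to 1/(2 x^2), contributing about 1/(4 x_max^2).
    return total * h + 1.0 / (4.0 * x_max ** 2)

for l in (1, 2, 3):
    for r_d in (1.0, 7.0):
        value = flat_spectrum_integral(l, r_d)
        print(l, r_d, value, 1.0 / (2 * l * (l + 1)))
```

The two printed columns agree closely for each $l$, and changing `r_d` leaves the result unchanged, as expected from the substitution $x = kR_D$.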


8 The latter is what one finds in the standard treatments. The difference is very significant, among other things
because the latter is generically non-zero even when the state of the quantum field is homogeneous and
isotropic, while the former will be non-vanishing only if the quantum state does not have those symmetries.

In order to proceed further, and study what our theoretical approach says regarding the
function $f(k)$ above, we need to consider the theory controlling the dynamics of the collapse.
As we said, we will here rely on CSL theory; however, in order to do that we still
need to choose the operator $\hat A$ driving the collapse and the value of the parameter $\lambda$.

It is convenient to work with a re-scaled field $y(\eta,\vec x) = a\,\delta\phi(\eta,\vec x)$ and its momentum
conjugate $\pi(\eta,\vec x) = a\,\delta\phi'(\eta,\vec x)$. For simplicity, we put everything in a box of size $L$ (to be
removed at the end of the calculations), and focus on a single mode $\vec k$. We thus write:


$\hat Y \equiv (2\pi/L)^{3/2}\,\hat y(\eta,\vec k), \qquad \hat\Pi \equiv (2\pi/L)^{3/2}\,\hat\pi(\eta,\vec k).$ (17.29)


As we saw, in order to compare with the observations, we need to evaluate the ensemble
average $\overline{\langle\hat\Pi\rangle^2}$ and determine under what circumstances, if any, this is $\propto k$. Recall that
we must consider the quantity $\overline{\langle\hat\Pi\rangle^2}$ rather than the quantity $\langle\hat\Pi^2\rangle$ that is considered in
standard treatments. This is because what we need to compute is the most likely
magnitude of the coefficients $\alpha_{lm}$, and not simply the quantum mechanical uncertainty of
that quantity in the relevant quantum state. We can see that this is the case, among other
reasons, because a non-vanishing value of the first clearly indicates a lack of isotropy in
the CMB, while a non-vanishing value of the second implies no such thing. This is just
as in the case of a harmonic oscillator (with potential $V = Kx^2$), where any state that is
symmetric under $x \to -x$ has vanishing $\langle\hat x\rangle$, while even the symmetric ground state has
non-vanishing $\langle\hat x^2\rangle$.
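
The harmonic oscillator comparison can be illustrated with a few lines of plain Python. This is only a sketch: the Gaussian form of $|\psi(x)|^2$, the width `sigma` and the displaced center are illustrative assumptions, standing in for the symmetric ground state and for a state whose symmetry has been broken.

```python
import math

def position_moments(center, sigma, half_range=12.0, n=4001):
    """Return (<x>, <x^2>) for |psi(x)|^2 proportional to exp(-(x-center)^2/(2 sigma^2))."""
    dx = 2 * half_range / (n - 1)
    norm = mean = mean_sq = 0.0
    for i in range(n):
        x = -half_range + i * dx
        p = math.exp(-(x - center) ** 2 / (2 * sigma ** 2))
        norm += p
        mean += x * p
        mean_sq += x * x * p
    return mean / norm, mean_sq / norm

# Symmetric state: <x> vanishes although <x^2> does not.
x_sym, x2_sym = position_moments(center=0.0, sigma=1.0)
# State that is not symmetric under x -> -x: <x> is now non-zero.
x_brk, x2_brk = position_moments(center=0.5, sigma=1.0)

print("symmetric:    <x>^2 =", x_sym ** 2, "  <x^2> =", x2_sym)
print("non-symmetric: <x>^2 =", x_brk ** 2, "  <x^2> =", x2_brk)
```

Only the square of the expectation value distinguishes the two cases in the way the argument requires: it is zero for the symmetric state, whereas $\langle\hat x^2\rangle$ is non-zero in both.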

The explicit calculations are a little bit involved and the interested reader is directed 
to [4]. 

We start by considering the case where $\hat\Pi$ is taken as the collapse-generating operator
toward one of whose eigenstates the state vector evolves (i.e. as the operator that acts as
the generator of collapse). That is, we set $\hat A = \hat\Pi$. In this case we obtain:


$\overline{\langle\hat\Pi\rangle^2} = \frac{\lambda k^2\mathcal{T}}{2} + \frac{k}{2} - \frac{k}{\sqrt{2}\,\sqrt{1+\sqrt{1+4\lambda^2}}}.$ (17.30)


We note that if we set the collapse rate parameter $\lambda = 0$ (turn off CSL), we would have
the standard quantum mechanics result $\overline{\langle\hat\Pi\rangle^2} = 0$, since $\langle\hat\Pi\rangle = 0$. This is as it should be:
if there is no mechanism for breaking the initial symmetry, then the value of a measure of
the departure from such symmetry should be zero.

We further see that agreement with the observed scale-invariant spectrum (i.e. a result
that to a good approximation is simply proportional to $k$) can be achieved if we assume
the first term is dominant and we set $\lambda = \tilde\lambda/k$, where $\tilde\lambda$ is a constant (independent of $k$).
We note that this replaces the dimensionless collapse rate parameter $\lambda$ with a parameter $\tilde\lambda$ of
dimension time$^{-1}$. In that case we obtain:


$\overline{\langle\hat\Pi\rangle^2} = \frac{\tilde\lambda k\mathcal{T}}{2} + \frac{k}{2} - \frac{k}{\sqrt{2}\,\sqrt{1+\sqrt{1+4(\tilde\lambda/k)^2}}}.$ (17.31)


We note that although we found an expression corresponding almost to a scale-invariant
spectrum, our analysis indicates a particular type of modification that can in principle be
searched for in the observational data.

Analogously, we have considered the case where $\hat{y}_{\mathbf{k}}$ is taken as the generator of collapse and
we obtained very similar, yet different, results. For details see [4].

Finally, by comparing with, and adjusting to, the overall amplitude of the observed CMB
anisotropies, using the GUT scale for the value of the inflation potential and other standard
values for the slow-roll parameters (of the order of a few per cent), we are led to the estimate
$\tilde{\lambda} \sim 10^{-5}\,\mathrm{Mpc}^{-1} \sim 10^{-19}\,\mathrm{s}^{-1}$. It is noteworthy that this value is not very different from
the GRW suggestion of $10^{-16}\,\mathrm{s}^{-1}$, although these values apply in quite different settings: the
latter in applying CSL to everyday non-relativistic physics, the former in applying CSL to
the cosmological situation at hand. Moreover, we should note the relatively wide range of
viable values of the parameter in the non-relativistic quantum settings and in particular its
likely dependence on the particle’s mass.
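
The conversion between the two units quoted for the estimate can be checked with a few lines of arithmetic. The sketch below assumes the convention $c = 1$, so that a length of one megaparsec corresponds to its light-travel time. The constants are standard approximate values, and the function name `per_mpc_to_per_second` is ours, introduced only for illustration.

```python
# Convert a rate quoted in inverse megaparsecs to inverse seconds (c = 1 convention).
MPC_IN_METERS = 3.0857e22      # one megaparsec in metres (approximate)
SPEED_OF_LIGHT = 2.99792458e8  # metres per second

def per_mpc_to_per_second(rate_per_mpc: float) -> float:
    """Return the rate in s^-1 for a rate given in Mpc^-1, using t = d / c."""
    seconds_per_mpc = MPC_IN_METERS / SPEED_OF_LIGHT  # light-travel time across 1 Mpc
    return rate_per_mpc / seconds_per_mpc

lambda_tilde = per_mpc_to_per_second(1e-5)
print(f"{lambda_tilde:.1e} s^-1")  # of order 1e-19 s^-1
```

One megaparsec is roughly $10^{14}$ light-seconds, which is why the exponent shifts by about fourteen orders of magnitude.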


17.6.4 Collapse on Field Operators 


One might be a bit concerned with the fact that in our treatment we have assumed that the 
collapse occurs mode by mode, given the fact that these quantum field modes are extended 
over the whole universe. The non-local aspect of this collapse suggests some possible con- 
flict with relativistic causality. However, a moment’s thought indicates that this is not very 
different from what occurs within any collapse theory treatment of something like an EPR 
situation. The collapse theories are known to work fine in these situations and imply no 
faster than light signaling despite their non-local aspects. After all, we all know that any 
successful description of the experimental test of Bell’s inequalities implies some level of 
non-locality [53]. 

One way we can see that, in principle, this should be no serious obstacle for the approach 
we have taken is by expressing the collapse dynamics in terms of local operators. Thus, we 
next write the version of the CSL theory we have been led to, when described in terms of 
the space-time field operators. In one case we can start by defining 


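$$\hat{\tilde{y}}(\mathbf{x}) \equiv \frac{1}{(2\pi)^{3/2}} \int d^3k\, e^{i\mathbf{k}\cdot\mathbf{x}}\, k^{1/2}\, \hat{y}_{\mathbf{k}} = (-\nabla^2)^{1/4}\,\hat{y}(\mathbf{x}), \tag{17.32}$$

where we recall that $y = a\,\delta\phi$. Then the state vector evolution is given by

$$|\psi,\eta\rangle = \mathrm{T}\exp\left\{-i\int_{-\mathcal{T}}^{\eta} d\eta'\,\hat{H} - \frac{1}{4\tilde{\lambda}}\int_{-\mathcal{T}}^{\eta} d\eta' \int d^3x\,\left[w(\mathbf{x},\eta') - 2\tilde{\lambda}\,\hat{\tilde{y}}(\mathbf{x})\right]^2\right\}|\psi,-\mathcal{T}\rangle. \tag{17.33}$$

This is just the standard CSL state-vector evolution, where the collapse-generating
operators (toward whose joint eigenstates collapse tends) are $\hat{\tilde{y}}(\mathbf{x})$ for all $\mathbf{x}$.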
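Similarly, in the case where we take $\hat{\pi}$ as the generator of collapse we have

$$|\psi,\eta\rangle = \mathrm{T}\exp\left\{-i\int_{-\mathcal{T}}^{\eta} d\eta'\,\hat{H} - \frac{1}{4\tilde{\lambda}}\int_{-\mathcal{T}}^{\eta} d\eta' \int d^3x\,\left[w(\mathbf{x},\eta') - 2\tilde{\lambda}\,\hat{\tilde{\pi}}(\mathbf{x})\right]^2\right\}|\psi,-\mathcal{T}\rangle, \tag{17.34}$$

where $\hat{\tilde{\pi}}(\mathbf{x}) \equiv (-\nabla^2)^{-1/4}\,\hat{\pi}(\mathbf{x})$. The above equation describes just the standard CSL state-
vector evolution, where the collapse-generating operators (toward whose joint eigenstates
collapse drives all states) are $\hat{\tilde{\pi}}(\mathbf{x})$ for all $\mathbf{x}$.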

Thus we have seen that the distribution of matter in our actual universe and the observed
CMB radiation which follows from it can be explained by the hypothesis of CSL dynamical
collapse acting on the state vector describing inflaton fluctuations in the early universe.
Indeed, we have seen that this can be done in two possible ways. However, it has turned out
that the collapse-generating operator which gives such results is not the inflaton fluctuation
itself $\hat{y}(\mathbf{x})$ or its conjugate momentum $\hat{\pi}(\mathbf{x})$, but rather a peculiar differential operator
acting on these, either $(-\nabla^2)^{-1/4}\hat{\pi}(\mathbf{x})$ or $(-\nabla^2)^{1/4}\hat{y}(\mathbf{x})$. A satisfactory explanation will
have to wait for a general theory expressing, in all situations, from particle physics to cosmology,
the exact form of the CSL-type of modification to the evolution of quantum states.
Such a generic theory would likely involve gravitation playing a fundamental role.


17.7 Discussion and Conclusions 


We have reviewed the standard account of the generation of the seeds of cosmic structure 
from quantum fluctuations in inflationary cosmology. We have argued that that account 
lacks any element that could be used to explain the necessary breakdown in the homogene- 
ity and isotropy of the vacuum state characterizing the quantum field. The issue is related to 
the interpretation of quantum mechanics, and, in particular, to the so-called “measurement 
problem”. The discussion of the conceptual issues was facilitated here by considering, 
within the same conditions associated with the cosmological problem, a simplified version 
of an early analysis by Mott of the quantum decay of a nucleus in a spherically symmetric 
initial state leading to the well known straight traces in a bubble chamber: the mini-Mott 
problem. The idea was to use this model as a test ground to examine any proposal for the 
resolution of the main problem. 

We have considered briefly the answers to the above conundrum that are most com- 
monly offered by inflationary cosmologists. We have examined these answers and have 
argued that they are clearly unsatisfactory, basically from the start. That is simply because 
in the cosmological context we need to account for the emergence of the primordial inho- 
mogeneities and anisotropies which are the seeds of all cosmic structure including galaxies, 
stars, planets, life, humans (or other sapient beings), and instruments. Therefore, we have 
to provide arguments that do not rely on instruments or observers at the stage where we 
want to understand the emergence of the primordial inhomogeneities. 
We have noted that the application of ideas based on decoherence requires the prior
identification of an environment, something that generically depends on our particular interests
or measuring capabilities, and should therefore be off limits, for the use in an explanatory 
context, in the situation at hand. Moreover, we have argued that the symmetry of the situ- 
ation ensures that the decoherence could only lead to reduced density matrices of a form 
that does not allow for the unique selection of the so-called pointer basis. 

In the case of many worlds interpretations we argued that the various schemes possess no 
elements allowing one to select the basis, or context in which the ensemble of alternatives 
must be described, or in which the “splitting of the world” takes place (or any other char- 
acterization of the passing from the one world to the many worlds that characterize these 
proposals). That is, by making different choices of basis, either in the mini-Mott case or in 
the cosmological situation, one could be led to argue in favor or against the breakdown of 
the initial symmetry. 

Something similar happens with approaches such as the consistent histories proposal.
There is simply nothing like an unambiguous rule indicating which basis, context or
“realm” to consider, and depending on the choice one could end up assigning non-vanishing
probabilities to non-symmetric histories, or an exactly vanishing probability for all except
the symmetric ones.

One idea that we did not consider in this chapter is that provided by the de Broglie–Bohm
(dBB) version of quantum theory. In that case, the physical situation is described
not only by the wave function, but also by a set of configuration variables, and thus it is 
perfectly possible to have a system characterized by a symmetric wave function, whose 
state is nonetheless not fully symmetric. That is, the asymmetry can reside in the con- 
figuration variable. It is thus possible to consider dBB as a viable scheme to address our 
problem. In fact, various works based on the application of dBB theory to the inflationary
situation have been published in recent years [60, 61]. However, when applying dBB
to the cosmological case at hand one cannot really argue that it describes, in the cosmological
context, the emergence of the primordial inhomogeneities, simply because within
this approach the symmetry was never there to start with. That is, even though the initial
wave function is symmetric, the initial values of the configuration variables were not, in
the individual cases, symmetric (this despite the fact that one might consider an
ensemble of realizations, and argue that at the statistical level the distribution of initial data
was symmetric).

Thus, we conclude that none of the existing frameworks for quantum theory (with the 
possible exception of the dBB proposal, with the caveat explained above) seems to offer a 
satisfactory account leading to the desired breaking of the initial symmetry in the problems 
at hand, and leading to what we think are the appropriate characterizations of the late time 
situations where the symmetry is gone. 

The analysis presented here indicates the attractiveness of incorporating something like 
the collapse of the wave function as a spontaneous dynamical aspect of nature, something 
that has been long advocated by some colleagues working on addressing the measurement 
problem in quantum theory. We have shown here how one can adapt one of the promising 
versions of such theories, the CSL proposal, to address the cosmological problem that
motivated this research path. Of course, a completely satisfactory theory is still lacking,
as that would require not only a general formulation that works in situations ranging
from the few-particle non-relativistic quantum mechanics to which the initial GRW and
CSL proposals were devoted to cosmological situations such as the one explored
here, but also a theory that is shown to be fully compatible with special and
general relativity. In the meanwhile, we feel the analysis we have presented serves as a
first set of steps in the explicit parametrization of the still mysterious aspect of physics 
that we have argued is missing from our understanding of the workings of nature and 
that is exhibited in a particularly blatant form in the context of cosmology. It is worthwhile 
mentioning that initial steps in the exploration of further applications of dynamical collapse 
theories indicate [33, 35, 36] they can be helpful in addressing other problems appearing 
at the interface of the quantum and gravitational realms, namely the so-called “black hole 
information paradox” and “the problem of time” in quantum gravity. On the other hand, 
we should accept the possibility that the approach based on non-unitary modifications of 
quantum theory that we have been exploring in this context, will be shown to be non- 
viable. However, even in that case, our analysis would have served to pinpoint a serious 
shortcoming in our understanding of the physical world. In any event, it seems quite clear 
that the research into these matters must continue. 
18
Towards a Novel Approach to Semi-Classical Gravity 


WARD STRUYVE 


18.1 Introduction 


Quantum gravity is often considered to be the holy grail of theoretical physics. One
approach is canonical quantum gravity, which concerns the Wheeler–DeWitt equation and
which is obtained by applying the usual quantization methods (which were so successful
in the case of high energy physics) to Einstein’s field equations. However, this approach
suffers from a host of problems, some of a technical and some of a conceptual nature (such as
finding solutions to the Wheeler–DeWitt equation, the problem of time, ...). For this reason
one often resorts to a semi-classical approximation where gravity is treated classically
and matter quantum mechanically [1, 2]. The hope is that such an approximation is easier
to analyse and yet reveals some effects of quantum gravitational nature.

In the usual approach to semi-classical gravity, matter is described by quantum field theory
on curved space-time. For example, in the case the matter is described by a quantized
scalar field, the state vector can be considered to be a functional $\Psi(\phi)$ on the space of
fields, which satisfies a particular Schrödinger equation

$$i\partial_t \Psi(\phi,t) = \hat{H}(\phi,g)\,\Psi(\phi,t), \tag{18.1}$$

where the Hamiltonian operator $\hat{H}$ depends on the space-time metric $g$. This metric satisfies
Einstein’s field equations

$$G_{\mu\nu}(g) = 8\pi G\,\langle\Psi|\hat{T}_{\mu\nu}(\phi,g)|\Psi\rangle, \tag{18.2}$$

where the source term is given by the expectation value of the energy-momentum tensor
operator.

This semi-classical approximation of course has limited validity. For example, it will
form a good approximation when the matter state approximately corresponds to a classical
state (i.e. a coherent state), but will fail to be so when the state is a macroscopic superposition
of such states. Namely, for such a superposition $\Psi = (\Psi_1 + \Psi_2)/\sqrt{2}$, we have
$\langle\Psi|\hat{T}_{\mu\nu}|\Psi\rangle \approx \left(\langle\Psi_1|\hat{T}_{\mu\nu}|\Psi_1\rangle + \langle\Psi_2|\hat{T}_{\mu\nu}|\Psi_2\rangle\right)/2$, so that the gravitational field is affected
by two matter sources, one coming from each term in the superposition. However, one
expects that according to a full theory for quantum gravity, the states $|\Psi_1\rangle$ and $|\Psi_2\rangle$ each
have their own gravitational field and that the total state is a superposition of those. And,
indeed, Page and Geilker showed with an experiment that this semi-classical theory is not
adequate [2, 3].

Of course, as already noted by Page and Geilker, it could be that this problem is not due 
to the fact that gravity is treated classically, but due to the choice of the version of quantum 
theory. Namely, Page and Geilker adopted the many worlds point of view, according to 
which the wave function never collapses. However, according to standard quantum theory 
the wave function is supposed to collapse during a measurement. Which physical pro- 
cesses act as measurements is of course rather vague and is the source of the measurement 
problem. But it could be that such collapses explain the outcome of their experiment. If 
an explanation of this type is sought, one should consider so-called spontaneous collapse 
theories, where collapses are objective, random processes that do not in a fundamental 
way depend on the notion of measurement. (See Diez-Tejedor and Sudarsky [4] and Der- 
akhshani [5] for actual proposals combining such a spontaneous collapse approach with, 
respectively, Eq. (18.2) and its non-relativistic version.) 

In this chapter, we consider an alternative to standard quantum mechanics, called 
Bohmian mechanics [6-9] and develop a semi-classical approximation based on it which 
is expected to improve upon the usual approach. 

Bohmian mechanics solves the measurement problem by introducing an actual configuration
(particle positions in the non-relativistic domain, particle positions or fields in the
relativistic domain [11]) that evolves under the influence of the wave function. According
to this approach, instead of coupling classical gravity to the wave function, it is natural to
couple it to the actual matter configuration. For example, in the case of a scalar field there
is an actual field $\phi_B$ whose time evolution is determined by the wave functional $\Psi$. There
is an energy-momentum tensor $T_{\mu\nu}(\phi_B, g)$ corresponding to this scalar field and this tensor
can be introduced as the source term in Einstein’s field equations:

$$G_{\mu\nu}(g) = 8\pi G\,T_{\mu\nu}(\phi_B, g). \tag{18.3}$$


This approach immediately solves the problem with the macroscopic superposition, since 
the energy-momentum tensor will correspond to just one of the macroscopic matter 
distributions. 

However, there is an immediate problem with this approach, namely that Eq. (18.3) is
not consistent. The Einstein tensor $G_{\mu\nu}$ is identically conserved, i.e. $\nabla^\mu G_{\mu\nu} = 0$. So the
Bohmian energy-momentum tensor $T_{\mu\nu}(\phi_B, g)$ must be conserved as well. However, the
equation of motion for the scalar field does not guarantee this. (Similarly, in the Bohmian
approach to non-relativistic systems, the energy is generically not conserved.)

As explained in Struyve [12], the root of the problem seems to be the gauge invariance,
which in this case is the invariance under spatial diffeomorphisms. Because the scalar field
and the space-time metric are connected by spatial diffeomorphisms, it seems that one cannot
just assume the metric to be classical without also assuming the scalar field $\phi_B$ to be
classical (in which case the energy-momentum tensor is conserved).
A similar problem arises when we consider a Bohmian semi-classical approximation to
scalar electrodynamics, which describes a scalar field interacting with an electromagnetic
field. In this case, the wave equation for the scalar field is of the form

$$i\partial_t \Psi(\phi,t) = \hat{H}(\phi,A)\,\Psi(\phi,t), \tag{18.4}$$

where $A$ is the vector potential. There is also a Bohmian scalar field $\phi_B$ and a charge current
$j^\nu(\phi_B, A)$ that could act as the source term in Maxwell’s equations

$$\partial_\mu F^{\mu\nu}(A) = j^\nu(\phi_B, A), \tag{18.5}$$

where $F^{\mu\nu}$ is the electromagnetic field tensor. In this case, we have $\partial_\nu\partial_\mu F^{\mu\nu} = 0$ due to
the anti-symmetry of $F^{\mu\nu}$. As such, the charge current must be conserved. However, the
Bohmian equation of motion for the scalar field does not imply conservation. Hence, just
as in the case of gravity, a consistency problem arises. As explained in Struyve [12], this
problem can be overcome by eliminating the gauge invariance, either by assuming some
gauge fixing or (equivalently) by working with gauge-independent degrees of freedom. In
this way, we can straightforwardly derive a semi-classical approximation starting from the
full Bohmian approach. For example, in the Coulomb gauge, the result is that there is an
extra current $j_Q^\nu$ which appears in addition to the usual charge current and which depends
on the quantum potential, so that Maxwell’s equations read

$$\partial_\mu F^{\mu\nu}(A) = j^\nu(\phi_B, A) + j_Q^\nu(\phi_B, A). \tag{18.6}$$
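
The conservation requirement used in this argument follows from a short index manipulation, spelled out here as an added worked step. Relabelling the dummy indices and using the fact that partial derivatives commute gives:

```latex
\partial_\nu \partial_\mu F^{\mu\nu}
  = \tfrac{1}{2}\left(\partial_\nu\partial_\mu F^{\mu\nu} + \partial_\mu\partial_\nu F^{\nu\mu}\right)
  = \tfrac{1}{2}\,\partial_\nu\partial_\mu\left(F^{\mu\nu} + F^{\nu\mu}\right)
  = 0
\quad\Longrightarrow\quad
\partial_\nu j^\nu = 0 \ \ \text{for Eq. (18.5)},
\qquad
\partial_\nu\left(j^\nu + j_Q^\nu\right) = 0 \ \ \text{for Eq. (18.6)} .
```

In the second case only the sum of the two currents has to be conserved, which is what leaves room for a Bohmian charge current that is not conserved by itself.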


While it is easy to eliminate the gauge invariance in the case of electrodynamics, this
is notoriously difficult in the case of general relativity. One can formulate a Bohmian
approach for the Wheeler–DeWitt equation for a scalar matter field interacting with
gravity, but the usual formulation does not explicitly eliminate the gauge freedom arising
from spatial diffeomorphism invariance. Our expectation is that one could find a
semi-classical approximation given such a formulation. At least we find our expectation
confirmed in simplified models, called mini-superspace models, where this invariance
is eliminated. We will illustrate this for the model described by the homogeneous and
isotropic Friedmann–Lemaître–Robertson–Walker (FLRW) metric and a uniform scalar
field.

In this chapter, we are merely concerned with the formulation of Bohmian semi-classical 
approximations. Practical applications will be studied elsewhere. Such applications already 
have been studied in the context of quantum chemistry, see Section 18.3. It appears that 
Bohmian semi-classical approximations yield better or equivalent results compared to the 
usual semi-classical approximation. (They are better in the sense that they are closer to 
the exact quantum results.) This gives good reason to hope that in other contexts, such as
quantum gravity, the Bohmian approach will also give better results. Potential applications
might be found in inflation theory, where the back-reaction from the quantum fluctuations 
onto the classical background can be studied, or in black hole physics, to study the back- 
reaction from the Hawking radiation onto space-time. 




The outline of the chapter is as follows. After introducing Bohmian mechanics in 
Section 18.2, we will discuss how to derive a Bohmian semi-classical approximation 
in the context of non-relativistic quantum mechanics. Semi-classical approximations to 
other quantum theories can be derived in a similar way. We present such approximations 
for scalar quantum electrodynamics in Section 18.4 and for a mini-superspace model in 
Section 18.5. More examples and details can be found in Struyve [12]. 


18.2 Bohmian Mechanics 
18.2.1 Non-Relativistic Quantum Mechanics 


Non-relativistic Bohmian mechanics (also called pilot-wave theory or de Broglie–Bohm
theory) is a theory about point-particles in physical space moving under the influence of
the wave function [6-9]. The equation of motion for the configuration $X = (X_1, \dots, X_n)$
of the particles is given by¹

$$\dot{X}(t) = v^\psi(X(t), t), \tag{18.7}$$

where $v^\psi = (v_1^\psi, \dots, v_n^\psi)$, with

$$v_k^\psi = \frac{1}{m_k}\,\mathrm{Im}\left(\frac{\nabla_k \psi}{\psi}\right) = \frac{1}{m_k}\nabla_k S \tag{18.8}$$

and $\psi = |\psi| e^{iS}$. The wave function $\psi(x,t) = \psi(x_1, \dots, x_n, t)$ itself satisfies the
non-relativistic Schrödinger equation

$$i\partial_t \psi(x,t) = \left(-\sum_{k=1}^{n} \frac{1}{2m_k}\nabla_k^2 + V(x)\right)\psi(x,t). \tag{18.9}$$

For an ensemble of systems all with the same wave function $\psi$, there is a distinguished
distribution given by $|\psi|^2$, which is called the quantum equilibrium distribution. This
distribution is equivariant. That is, it is preserved by the particle dynamics Eq. (18.7) in
the sense that if the particle distribution is given by $|\psi(x,t_0)|^2$ at some time $t_0$, then it is
given by $|\psi(x,t)|^2$ at all times $t$. This follows from the fact that any distribution $\rho$ that is
transported by the particle motion satisfies the continuity equation

$$\partial_t \rho + \sum_{k=1}^{n} \nabla_k \cdot \left(v_k^\psi \rho\right) = 0 \tag{18.10}$$

and that $|\psi|^2$ satisfies the same equation, i.e.

$$\partial_t |\psi|^2 + \sum_{k=1}^{n} \nabla_k \cdot \left(v_k^\psi |\psi|^2\right) = 0, \tag{18.11}$$

as a consequence of the Schrödinger equation. It can be shown that for a typical initial
configuration of the universe, the (empirical) particle distribution for an actual ensemble
of subsystems within the universe will be given by the quantum equilibrium distribution
[8, 9, 10]. Therefore for such a configuration Bohmian mechanics reproduces the standard
quantum predictions.

1 Throughout the chapter we assume units in which $\hbar = c = 1$.


Note that the velocity field is of the form $j^\psi/|\psi|^2$, where $j^\psi = (j_1^\psi, \dots, j_n^\psi)$ with
$j_k^\psi = \mathrm{Im}(\psi^* \nabla_k \psi)/m_k$ is the usual quantum current. In other quantum theories, such as
for example quantum field theories, the velocity can be defined in a similar way by dividing
the appropriate current by the density. In this way equivariance of the density will be
ensured. (See Struyve and Valentini [13] for a treatment of arbitrary Hamiltonians.)
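
Equivariance can be illustrated numerically in the simplest case. The sketch below is our own illustration, not taken from the references: it assumes a single free particle in one dimension whose wave function is a Gaussian packet at rest, $\psi(x,t) \propto \exp[-x^2/(4\sigma_0^2(1+i\tau))]$ with $\tau = t/(2m\sigma_0^2)$ and $\hbar = 1$. For this packet the guidance equation gives $v = x\,\tau/[2m\sigma_0^2(1+\tau^2)]$, and $|\psi|^2$ is a Gaussian of width $\sigma_0\sqrt{1+\tau^2}$. The function names are hypothetical.

```python
import math

def velocity(x: float, t: float, m: float = 1.0, sigma0: float = 1.0) -> float:
    """Bohmian velocity Im(psi'/psi)/m for the assumed free Gaussian packet (hbar = 1)."""
    tau = t / (2.0 * m * sigma0**2)
    return x * tau / (2.0 * m * sigma0**2 * (1.0 + tau**2))

def trajectory(x0: float, t_final: float, steps: int = 2000) -> float:
    """Integrate dX/dt = v(X, t) from t = 0 with the classical fourth-order Runge-Kutta scheme."""
    dt = t_final / steps
    x, t = x0, 0.0
    for _ in range(steps):
        k1 = velocity(x, t)
        k2 = velocity(x + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = velocity(x + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = velocity(x + dt * k3, t + dt)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return x

def width(t: float, m: float = 1.0, sigma0: float = 1.0) -> float:
    """Standard deviation of |psi|^2 at time t for the assumed packet."""
    return sigma0 * math.sqrt(1.0 + (t / (2.0 * m * sigma0**2)) ** 2)

t_final = 3.0
for x0 in (-1.5, 0.3, 2.0):
    x_t = trajectory(x0, t_final)
    # Position in units of the packet width, before and after the evolution.
    print(x0 / width(0.0), x_t / width(t_final))
```

Each trajectory keeps its position fixed when measured in units of the packet width, so an ensemble of initial positions drawn from the Gaussian $|\psi(x,0)|^2$ is distributed according to the Gaussian $|\psi(x,t)|^2$ at the later time.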

This theory solves the measurement problem. Notions such as measurement or observer 
play no fundamental role. Instead measurement can be treated as any other physical 
process. 

There are two aspects of the theory that are important for deriving the semi-classical
approximation. Firstly, Bohmian mechanics allows for an unambiguous analysis of the
classical limit. Namely, the classical limit is obtained whenever the particles (or at least the
relevant macroscopic variables, such as the center of mass) move classically, i.e. satisfy
Newton’s equation. By taking the time derivative of Eq. (18.7), we find that

$$m_k \ddot{X}_k(t) = -\nabla_k \left.\left(V(x) + Q^\psi(x,t)\right)\right|_{x = X(t)}, \tag{18.12}$$

where

$$Q^\psi = -\sum_{k=1}^{n} \frac{1}{2m_k} \frac{\nabla_k^2 |\psi|}{|\psi|} \tag{18.13}$$

is the quantum potential. Hence, if the quantum force $-\nabla_k Q^\psi$ is negligible compared
to the classical force $-\nabla_k V$, then the $k$th particle approximately moves along a classical
trajectory.
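
To get a feeling for when the quantum force is negligible, it may help to evaluate Eq. (18.13) in a simple case. The following worked example is an illustration added here, assuming a single particle in one dimension whose wave function has a Gaussian modulus of width $\sigma$:

```latex
|\psi(x)| \propto e^{-x^2/4\sigma^2}
\quad\Longrightarrow\quad
Q^\psi(x) = -\frac{1}{2m}\,\frac{\partial_x^2 |\psi|}{|\psi|}
          = \frac{1}{4m\sigma^2}\left(1 - \frac{x^2}{2\sigma^2}\right),
\qquad
-\partial_x Q^\psi = \frac{x}{4m\sigma^4} .
```

The quantum force scales as $1/(m\sigma^4)$, so in this example it becomes small compared to a given classical force for heavy particles and broad packets.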

Another aspect of the theory is that it allows for a simple and natural definition for
the wave function of a subsystem [8, 10]. Namely, consider a system with wave function
$\psi(x, y, t)$ where $x$ is the configuration variable of the subsystem and $y$ is the configuration
variable of its environment. The actual configuration is $(X, Y)$, where $X$ is the configuration
of the subsystem and $Y$ is the configuration of the other particles. The wave function of the
subsystem $\chi(x,t)$, called the conditional wave function, is then defined as

$$\chi(x,t) = \psi(x, Y(t), t). \tag{18.14}$$

This is a natural definition since the trajectory $X(t)$ of the subsystem satisfies

$$\dot{X}(t) = v^\psi(X(t), Y(t), t) = v^\chi(X(t), t). \tag{18.15}$$

That is, for the evolution of the subsystem’s configuration we can either consider the
conditional wave function or the total wave function (keeping the initial positions fixed). (The
conditional wave function is also the wave function that would be found by a natural
operationalist method for defining the wave function of a quantum mechanical subsystem [14].)
The time evolution of the conditional wave function is completely determined by the time
evolution of $\psi$ and that of $Y$. This implies that the conditional wave function does not
necessarily satisfy a Schrödinger equation, although in many cases it does. This wave
function collapses according to the usual textbook rules when an actual measurement is
performed.
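
The effective collapse of the conditional wave function can be made concrete with a toy model. The sketch below is an illustration under our own assumptions: a one-dimensional system entangled with a one-dimensional "pointer", each branch being a product of real Gaussian packets, with the pointer packets far apart. All names and numerical values are hypothetical.

```python
import math

def gauss(z: float, center: float, width: float = 0.5) -> float:
    """Unnormalised real Gaussian packet."""
    return math.exp(-((z - center) ** 2) / (4.0 * width**2))

def psi(x: float, y: float) -> float:
    """Entangled state: system packet at -2 goes with pointer at -10, packet at +2 with pointer at +10."""
    return (gauss(x, -2.0) * gauss(y, -10.0) + gauss(x, 2.0) * gauss(y, 10.0)) / math.sqrt(2.0)

def conditional(x: float, Y: float) -> float:
    """Conditional wave function chi(x) = psi(x, Y), as in Eq. (18.14), up to normalisation."""
    return psi(x, Y)

Y_actual = 10.2  # assumed actual pointer configuration, inside the right-hand pointer packet
left = abs(conditional(-2.0, Y_actual))
right = abs(conditional(2.0, Y_actual))
print(left / right)  # much smaller than one: only the right-hand system packet survives
```

Although the total wave function still contains both branches, conditioning on the actual pointer configuration leaves a wave function for the system that is, for all practical purposes, the single packet selected by the pointer.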


18.2.2 Quantum Field Theory


We will also consider semi-classical approximations to quantum field theories. More
specifically, we will consider bosonic quantum field theories. In Bohmian approaches to
such theories it is easiest to introduce actual field variables rather than particle positions
[11, 15]. To illustrate how this works, let us consider the free massless real scalar field (for
the treatment of other bosonic field theories see Struyve [15]). Working in the functional
Schrödinger picture, the quantum state vector is a wave functional $\Psi(\phi)$ defined on a space
of scalar fields in 3-space and it satisfies the functional Schrödinger equation

$$i\partial_t \Psi(\phi,t) = \frac{1}{2}\int d^3x \left(-\frac{\delta^2}{\delta\phi(\mathbf{x})^2} + \nabla\phi(\mathbf{x}) \cdot \nabla\phi(\mathbf{x})\right)\Psi(\phi,t). \tag{18.16}$$

The associated continuity equation is

$$\partial_t |\Psi(\phi,t)|^2 + \int d^3x\, \frac{\delta}{\delta\phi(\mathbf{x})}\left(\frac{\delta S(\phi,t)}{\delta\phi(\mathbf{x})}\,|\Psi(\phi,t)|^2\right) = 0, \tag{18.17}$$

where $\Psi = |\Psi| e^{iS}$. This suggests the guidance equation

$$\dot{\phi}(\mathbf{x},t) = \left.\frac{\delta S(\phi,t)}{\delta\phi(\mathbf{x})}\right|_{\phi(\mathbf{x}) = \phi(\mathbf{x},t)}. \tag{18.18}$$

(Note that in this case we did not distinguish notationally the actual field variable from the
argument of the wave functional.) Taking the time derivative of this equation results in

$$\Box\phi(\mathbf{x},t) = -\left.\frac{\delta Q^\Psi(\phi,t)}{\delta\phi(\mathbf{x})}\right|_{\phi(\mathbf{x}) = \phi(\mathbf{x},t)}, \tag{18.19}$$

where

$$Q^\Psi = -\frac{1}{2|\Psi|}\int d^3x\, \frac{\delta^2 |\Psi|}{\delta\phi(\mathbf{x})^2} \tag{18.20}$$

is the quantum potential. The classical limit is obtained whenever the quantum
force, i.e. the right hand side of Eq. (18.19), is negligible. Then the field approximately
satisfies the classical field equation $\Box\phi = 0$.
One can also consider the conditional wave functional of a subsystem. A subsystem can
in this case be regarded as a system confined to a certain region in space. The conditional 
wave functional for the field confined to that region is then obtained from the total wave 
functional by conditioning over the actual field value on the complement of that region. 
However, in the following we will not consider this kind of conditional wave functional. 
Rather, there will be other degrees of freedom, like for example other fields, which will be 
conditioned over. 

This Bohmian approach is not Lorentz invariant. The guidance equation (18.18) is formulated
with respect to a preferred reference frame and as such violates Lorentz invariance.
This violation does not show up in the statistical predictions given quantum equilibrium,
since the theory makes the same predictions as standard quantum theory, which are Lorentz
invariant.² The difficulty in finding a Lorentz invariant theory resides in the fact that any
adequate formulation of quantum theory must be non-local [16]. One approach to make
the Bohmian theory Lorentz invariant is by introducing a foliation which is determined by 
the wave function in a covariant way [17]. In this chapter, we will not attempt to main- 
tain Lorentz invariance. As such, the Bohmian semi-classical approximations will not be 
Lorentz invariant, (very likely) not even concerning the statistical predictions. This is in 
contrast with the usual approach like the one for gravity given by Eq. (18.1) and Eq. (18.2) 
which is fully Lorentz invariant. However, this does not take away the expectation that 
the Bohmian semi-classical approximation will give better or at least equivalent results 
compared to the usual approach. 


18.2.3 Quantum Gravity


In canonical quantum gravity, the state vector is a functional $\Psi(h, \phi)$ on the space of
3-metrics $h_{ij}(x)$ on a three-dimensional (3D) manifold and fields $\phi(x)$ (in the case the matter
is described by a quantized scalar field). The wave functional is static and merely satisfies
the constraints [2]

$$\mathcal{H}\,\Psi(h, \phi) = 0, \tag{18.21}$$

$$\mathcal{H}_i\,\Psi(h, \phi) = 0. \tag{18.22}$$


Their explicit forms are not important here. The latter constraint expresses the fact that the 
wave functional is invariant under infinitesimal diffeomorphisms of 3-space. The former 
equation is the Wheeler–DeWitt equation. It is believed that this equation contains the
dynamical content of the theory. However, it is as yet not clear how this dynamical content 
should be extracted. This is the problem of time [2, 18]. 


2 Actually, this statement needs some qualifications since regulators need to be introduced to make the theory 
and its statistical predictions well defined [15]. 
In the Bohmian approach, there is an actual 3-metric and a scalar field, whose dynamics
depends on the wave functional [19-21]. The dynamics expresses how the Bohmian configuration
changes along a succession of 3D space-like surfaces.³ Although the wave function
is stationary, the Bohmian configuration will change along these surfaces for generic wave
functions. This is how the Bohmian approach solves the problem of time.

Some cosmological applications of the Bohmian approach to quantum gravity are the 
explanation of the quantum-to-classical transition in inflation theory [22, 23] and the study 
of space-time singularities [24—26]. 


18.3 Non-Relativistic Quantum Mechanics
18.3.1 Usual Versus Bohmian Semi-Classical Approximation 


Consider a composite system of just two particles. The usual semi-classical approach
(also called the mean-field approach) goes as follows. Particle 1 is described quantum
mechanically, by a wave function $\chi(x_1, t)$, which satisfies the Schrödinger equation

$$i\partial_t \chi(x_1,t) = \left[-\frac{1}{2m_1}\nabla_1^2 + V(x_1, X_2(t))\right]\chi(x_1,t), \tag{18.23}$$

where the potential is evaluated at the position of the second particle $X_2$, which satisfies
Newton’s equation

$$m_2 \ddot{X}_2(t) = -\left.\langle\chi|\nabla_2 V(x_1,x_2)|\chi\rangle\right|_{x_2 = X_2(t)} = \int d^3x_1\, |\chi(x_1,t)|^2 \left.\left[-\nabla_2 V(x_1,x_2)\right]\right|_{x_2 = X_2(t)}. \tag{18.24}$$

So the force on the right hand side is averaged over the quantum particle.

An alternative semi-classical approach based on Bohmian mechanics was proposed
independently by Gindensperger et al. [27] and Prezhdo and Brooksby [28]. In this
approach there is also an actual position for particle 1, denoted by $X_1$, which satisfies
the equation

$$\dot{X}_1(t) = v^\chi(X_1(t), t), \tag{18.25}$$

where

$$v^\chi = \frac{1}{m_1}\,\mathrm{Im}\,\frac{\nabla_1 \chi}{\chi}, \tag{18.26}$$

and where $\chi$ satisfies the Schrödinger equation (18.23). But instead of Eq. (18.24), the
second particle now satisfies

$$m_2 \ddot{X}_2(t) = -\left.\nabla_2 V(X_1(t), x_2)\right|_{x_2 = X_2(t)}, \tag{18.27}$$

where the force depends on the position of the first particle. So in this approximation the
second particle is not acted upon by some average force, but rather by the actual particle of
the quantum system. This approximation is therefore expected to be better
than the usual approach, in the sense that it yields predictions closer to those of
the full quantum theory, especially in the case where the wave function evolves into a
superposition of non-overlapping packets. This is indeed confirmed by a number of studies,
as we will discuss below.

3 The succession of the surfaces is determined by the lapse function, and different choices of lapse function
lead to different Bohmian dynamics. This means that the dynamics is not invariant under space-time
diffeomorphisms. Again, the root of the problem is the non-locality of quantum theory.
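
The difference between the two force laws, Eq. (18.24) and Eq. (18.27), is easiest to see for a superposition of two well-separated packets. The sketch below is a static illustration under our own assumptions, not a simulation of the coupled dynamics: one dimension, a harmonic coupling $V(x_1,x_2) = \tfrac{1}{2}\kappa(x_1-x_2)^2$, and a density $|\chi|^2$ given by two equal Gaussian packets at $\pm a$ whose interference term is neglected. All names and values are hypothetical.

```python
import math

KAPPA = 1.0  # assumed coupling strength in V(x1, x2) = 0.5 * KAPPA * (x1 - x2)**2

def force_on_2(x1: float, x2: float) -> float:
    """Force -dV/dx2 on particle 2 for the assumed harmonic coupling."""
    return KAPPA * (x1 - x2)

def chi_squared(x1: float, a: float = 5.0, width: float = 0.5) -> float:
    """|chi|^2 for two equal, non-overlapping Gaussian packets at -a and +a (interference neglected)."""
    def packet(center: float) -> float:
        return math.exp(-((x1 - center) ** 2) / (2.0 * width**2))
    return 0.5 * (packet(-a) + packet(a)) / (width * math.sqrt(2.0 * math.pi))

def mean_field_force(X2: float, n: int = 4001, L: float = 12.0) -> float:
    """Eq. (18.24): average the force over |chi|^2 with a Riemann sum on a uniform grid."""
    dx = 2.0 * L / (n - 1)
    return sum(chi_squared(-L + i * dx) * force_on_2(-L + i * dx, X2) for i in range(n)) * dx

def bohmian_force(X1: float, X2: float) -> float:
    """Eq. (18.27): the force evaluated at the actual position X1 of particle 1."""
    return force_on_2(X1, X2)

X2 = 0.0
print(mean_field_force(X2))     # close to zero: the two packets pull in opposite directions
print(bohmian_force(5.0, X2))   # particle 1 actually sits in the right-hand packet
print(bohmian_force(-5.0, X2))  # particle 1 actually sits in the left-hand packet
```

In the mean-field case the classical particle feels the average of the two branches and stays put, whereas in the Bohmian case it is pulled toward whichever packet actually contains the first particle.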

Let us first mention some properties of this approximation and compare them to the
usual approach. In the mean field approach, the specification of an initial wave function
$\chi(x_1, t_0)$, an initial position $X_2(t_0)$ and velocity $\dot X_2(t_0)$ determines a unique solution for the
wave function and the trajectory of the classical particle. In the Bohmian approach also the
initial position $X_1(t_0)$ of the particle of the quantum system needs to be specified in order to
uniquely determine a solution. Different initial positions $X_1(t_0)$ yield different evolutions
for the wave function and the classical particle. This is because the evolution of each of
the variables $X_1$, $X_2$, $\chi$ depends on the others. Namely, the evolution of $\chi$ depends on $X_2$
via Eq. (18.23), whose evolution in turn depends on $X_1$ via Eq. (18.27), whose evolution
in turn depends on $\chi$ via Eq. (18.25). (This should be contrasted with the full Bohmian
theory, where the wave function acts on the particles, but there is no back-reaction from
the particles onto the wave function.)
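The mutual dependence of $\chi$, $X_1$ and $X_2$ described above can be sketched as a time-stepping loop. The following is only a minimal one-dimensional illustration under stated assumptions: units with $\hbar = 1$, a hypothetical harmonic coupling $V = k(x_1 - x_2)^2/2$, a Crank-Nicolson step for Eq. (18.23), and simple Euler updates for Eqs. (18.25) and (18.27). None of the parameter values come from the cited studies.

```python
import cmath

def solve_tridiagonal(off, diag, rhs):
    """Thomas algorithm for a tridiagonal system with constant off-diagonal `off`."""
    n = len(rhs)
    c, d = [0j] * n, [0j] * n
    c[0], d[0] = off / diag[0], rhs[0] / diag[0]
    for j in range(1, n):
        denom = diag[j] - off * c[j - 1]
        c[j] = off / denom
        d[j] = (rhs[j] - off * d[j - 1]) / denom
    x = [0j] * n
    x[-1] = d[-1]
    for j in range(n - 2, -1, -1):
        x[j] = d[j] - c[j] * x[j + 1]
    return x

def crank_nicolson_step(chi, grid, x2, dt, m1, k):
    """One step of Eq. (18.23) with the potential evaluated at the current X2."""
    dx = grid[1] - grid[0]
    n = len(chi)
    hop = 1j * dt / (4.0 * m1 * dx * dx)
    diag, rhs = [], []
    for j, x in enumerate(grid):
        h_jj = 1.0 / (m1 * dx * dx) + 0.5 * k * (x - x2) ** 2
        left = chi[j - 1] if j > 0 else 0j
        right = chi[j + 1] if j + 1 != n else 0j
        diag.append(1.0 + 0.5j * dt * h_jj)
        rhs.append((1.0 - 0.5j * dt * h_jj) * chi[j] + hop * (left + right))
    return solve_tridiagonal(-hop, diag, rhs)

def bohmian_velocity(chi, grid, x1, m1):
    """Eq. (18.26), evaluated at the grid point nearest to X1 by central differences."""
    dx = grid[1] - grid[0]
    j = min(max(int(round((x1 - grid[0]) / dx)), 1), len(grid) - 2)
    return ((chi[j + 1] - chi[j - 1]) / (2.0 * dx) / chi[j]).imag / m1

def make_grid():
    return [-10.0 + 0.1 * i for i in range(201)]

def initial_state(grid):
    """Assumed initial packet: Gaussian at the origin with unit momentum."""
    return [cmath.exp(-x * x / 4.0 + 1j * x) for x in grid]

def norm(chi, grid):
    return sum(abs(c) ** 2 for c in chi) * (grid[1] - grid[0])

def run(steps=50, dt=0.01, m1=1.0, m2=50.0, k=1.0):
    grid = make_grid()
    chi = initial_state(grid)
    x1, x2, p2 = 0.3, 1.0, 0.0               # assumed initial X1, X2 and momentum of particle 2
    for _ in range(steps):
        chi = crank_nicolson_step(chi, grid, x2, dt, m1, k)   # chi depends on X2
        x1 += dt * bohmian_velocity(chi, grid, x1, m1)        # X1 depends on chi
        p2 += dt * k * (x1 - x2)                              # X2 depends on X1, Eq. (18.27)
        x2 += dt * p2 / m2
    return chi, x1, x2

chi, x1, x2 = run()
print(x1, x2)
```

Rerunning `run` with a different assumed initial value of `x1` changes the force on particle 2, and hence the potential seen by `chi`, which is the back-reaction noted in the text.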

The initial configuration $X_1(t_0)$ should be considered random with distribution
$|\chi(x_1, t_0)|^2$. However, this does not imply that $X_1(t)$ is random with distribution
$|\chi(x_1, t)|^2$ for later times $t$. It is not even clear what the latter statement should mean,
since different initial positions $X_1(t_0)$ lead to different wave function evolution; so which
wave function should $\chi(x_1, t)$ be?

This semi-classical approximation has been applied to a number of systems. Prezhdo 
and Brooksby studied the case of a light particle scattering off a heavy particle [28]. They
considered the scattering probability over time and found that the Bohmian semi-classical 
approximation was in better agreement with the exact quantum mechanical prediction than 
the usual approximation. The Bohmian semi-classical approximation gives probability one 
for the scattering to have happened after some time, in agreement with the exact result, 
whereas the probability predicted by the usual approach does not reach one. The reported 
reason for the better results is that the wave function of the quantum particle evolves into a 
superposition of non-overlapping packets, which yields bad results for the usual approach 
(since the force on the classical particle contains contributions from both packets), but 
not for the Bohmian approach. These results were confirmed and further expanded by 
Gindensperger et al. [29]. Other examples have been considered [27, 30, 31]. In those 
cases, the Bohmian semi-classical approximation gave very good agreement with the exact 
quantum or experimental results. It was always either better or comparable to the usual 
approach. These results give good hope that the Bohmian semi-classical approximation 
will also give better results than the usual approximation in other domains such as quantum 
gravity. 




18.3.2 Derivation of the Bohmian Semi-Classical Approximation 


The Bohmian semi-classical approach can easily be derived from the full Bohmian theory.$^4$
Consider a system of two particles. In the Bohmian description of this system, we
have a wave function $\psi(x_1, x_2, t)$ and positions $X_1(t)$, $X_2(t)$, which respectively satisfy the
Schrödinger equation

$$ i\partial_t \psi = \left[ -\frac{1}{2m_1}\nabla_1^2 - \frac{1}{2m_2}\nabla_2^2 + V(x_1, x_2) \right] \psi \tag{18.28} $$

and the guidance equations

$$ \dot X_1(t) = v_1^\psi(X_1(t), X_2(t), t), \qquad \dot X_2(t) = v_2^\psi(X_1(t), X_2(t), t). \tag{18.29} $$

The conditional wave function $\chi(x_1, t) = \psi(x_1, X_2(t), t)$ for particle 1 satisfies the
equation

$$ i\partial_t \chi(x_1,t) = \left( -\frac{\nabla_1^2}{2m_1} + V(x_1, X_2(t)) \right) \chi(x_1,t) + I(x_1,t), \tag{18.30} $$

where

$$ I(x_1,t) = -\frac{\nabla_2^2}{2m_2}\psi(x_1,x_2,t)\Big|_{x_2=X_2(t)} + i\nabla_2\psi(x_1,x_2,t)\Big|_{x_2=X_2(t)} \cdot v_2^\psi(X_1(t), X_2(t), t). \tag{18.31} $$

So in case $I$ is negligible in Eq. (18.30), up to a time-dependent factor times $\chi$,$^5$ we are
led to the Schrödinger equation (18.23). This will, for example, be the case if $m_2$ is much
larger than $m_1$ ($I$ is inversely proportional to $m_2$) and if the wave function slowly varies as
a function of $x_2$. We also have

$$ m_2 \ddot X_2(t) = -\nabla_2\left[ V(X_1(t), x_2) + Q^\psi(X_1(t), x_2, t) \right]\Big|_{x_2=X_2(t)}, \tag{18.32} $$

with $Q^\psi$ the quantum potential. We obtain the classical Eq. (18.27), if the quantum force
is negligible compared to the classical force.
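The step from the full Schrödinger equation (18.28) to Eq. (18.30) is an application of the chain rule to $\chi(x_1,t) = \psi(x_1, X_2(t), t)$. Written out as a short worked derivation, using only the symbols defined above:

```latex
\begin{align*}
i\partial_t \chi(x_1,t)
  &= i\,\partial_t \psi(x_1,x_2,t)\Big|_{x_2=X_2(t)}
     + i\,\nabla_2 \psi(x_1,x_2,t)\Big|_{x_2=X_2(t)} \cdot \dot X_2(t) \\
  &= \left[ -\frac{\nabla_1^2}{2m_1} - \frac{\nabla_2^2}{2m_2} + V(x_1,x_2) \right]
     \psi(x_1,x_2,t)\Big|_{x_2=X_2(t)}
     + i\,\nabla_2 \psi\Big|_{x_2=X_2(t)} \cdot v_2^\psi(X_1(t),X_2(t),t) \\
  &= \left[ -\frac{\nabla_1^2}{2m_1} + V(x_1,X_2(t)) \right] \chi(x_1,t) + I(x_1,t).
\end{align*}
```

The first line uses the chain rule, the second inserts Eq. (18.28) and the guidance equation (18.29) for $X_2$, and the third collects the two terms involving $\nabla_2$ into $I$ as defined in Eq. (18.31).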

In this way we obtain the equations for a semi-classical formulation. In addition, we 
also have the conditions under which they will be valid. For other quantum theories, 
such as quantum gravity, we can follow a similar path to find a Bohmian semi-classical 
approximation. 


4 The derivation is very close to the one followed by Gindensperger et al. [27]. A difference is that they also let 
the wave function of the quantum system depend parametrically on the position of the classical particle. This 
leads to a quantum force term in the Eq. (18.27) for particle 2. However, this does not seem to lead to a useful 
set of equations. In particular, they can not be numerically integrated by simply specifying the initial wave 
function and particle positions. In any case, Gindensperger et al. drop this quantum force when considering 
examples [27, 29, 30], so that the resulting equations correspond to the ones presented above. 

5 If $I$ contains a term of the form $f(t)\chi$, then it can be eliminated by changing the phase of $\chi$ by a time-dependent
term.


18.4 Scalar Electrodynamics


We consider scalar electrodynamics to illustrate the issues with developing a Bohmian 
semi-classical approximation for a gauge theory. There are various equivalent ways of 
formulating the Bohmian approach [12]. These formulations can either be found by con- 
sidering different gauges or by working with different choices of gauge-independent 
variables. Here, we will consider two examples of gauges, namely the temporal gauge, 
which is an incomplete gauge fixing, and the Coulomb gauge, which completely fixes the 
gauge symmetry. Using the former gauge, we are not immediately led to a semi-classical
approximation, due to the remaining gauge freedom, whereas using the latter gauge we are.

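In classical scalar electrodynamics, the equations of motion for the scalar field $\phi$ and the
vector potential $A^\mu = (A_0, \mathbf{A})$ are

$$ D_\mu D^\mu \phi + m^2\phi = 0, \qquad \partial_\mu F^{\mu\nu} = j^\nu, \tag{18.33} $$

where $D_\mu = \partial_\mu + ieA_\mu$ is the covariant derivative, $F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu$ the electromagnetic
field tensor and

$$ j^\nu = ie\left( \phi^* D^\nu \phi - \phi D^{\nu *} \phi^* \right) \tag{18.34} $$

is the charge current. The theory has a local gauge symmetry

$$ \phi \to e^{ie\theta}\phi, \qquad A^\mu \to A^\mu - \partial^\mu\theta. \tag{18.35} $$

One possible choice of gauge is the temporal gauge $A_0 = 0$. It does not completely
fix the gauge; there is still a residual gauge symmetry given by the time-independent
transformations

$$ \phi \to e^{ie\theta}\phi, \qquad \mathbf{A} \to \mathbf{A} + \nabla\theta, \tag{18.36} $$

with $\dot\theta = 0$. Quantization in this gauge leads to the following functional Schrödinger
equation for $\Psi(\phi, \mathbf{A}, t)$ [32]:

$$ i\partial_t \Psi = \int d^3x \left( -\frac{\delta^2}{\delta\phi^*\delta\phi} + |\mathbf{D}\phi|^2 + m^2|\phi|^2 - \frac{1}{2}\frac{\delta^2}{\delta\mathbf{A}^2} + \frac{1}{2}(\nabla\times\mathbf{A})^2 \right)\Psi, \tag{18.37} $$

together with the constraint

$$ \nabla\cdot\frac{\delta\Psi}{\delta\mathbf{A}} + ie\left( \phi^*\frac{\delta\Psi}{\delta\phi^*} - \phi\frac{\delta\Psi}{\delta\phi} \right) = 0. \tag{18.38} $$

The constraint expresses the fact that the wave functional is invariant under time-independent
gauge transformations, i.e. $\Psi(\phi, \mathbf{A}) = \Psi(e^{ie\theta}\phi, \mathbf{A} + \nabla\theta)$, with $\theta$ time-independent.
The constraint is compatible with the Schrödinger equation: if it is satisfied
at one time, it is satisfied at all times.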
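That the constraint (18.38) is the infinitesimal form of invariance under the residual transformations (18.36) can be checked directly. The sketch below assumes an infinitesimal, time-independent $\theta(x)$ that vanishes at spatial infinity, so that the boundary term in the integration by parts drops out.

```latex
\begin{align*}
\delta\phi &= ie\theta\,\phi, \qquad
\delta\phi^* = -ie\theta\,\phi^*, \qquad
\delta\mathbf{A} = \nabla\theta, \\
\delta\Psi
  &= \int d^3x \left[ ie\theta \left( \phi\frac{\delta\Psi}{\delta\phi}
     - \phi^*\frac{\delta\Psi}{\delta\phi^*} \right)
     + \nabla\theta\cdot\frac{\delta\Psi}{\delta\mathbf{A}} \right] \\
  &= -\int d^3x\, \theta \left[ \nabla\cdot\frac{\delta\Psi}{\delta\mathbf{A}}
     + ie\left( \phi^*\frac{\delta\Psi}{\delta\phi^*}
     - \phi\frac{\delta\Psi}{\delta\phi} \right) \right].
\end{align*}
```

Requiring $\delta\Psi = 0$ for arbitrary $\theta$ gives exactly the bracketed expression of Eq. (18.38) set to zero.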


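6 The wave functional should be understood as a functional of the real and imaginary part of $\phi$. In addition,
writing $\phi = (\phi_r + i\phi_i)/\sqrt{2}$, we have that the functional derivatives are given by $\delta/\delta\phi = (\delta/\delta\phi_r - i\delta/\delta\phi_i)/\sqrt{2}$
and $\delta/\delta\phi^* = (\delta/\delta\phi_r + i\delta/\delta\phi_i)/\sqrt{2}$.

In the Bohmian approach [33], there are actual configurations $\phi$ and $\mathbf{A}$ that satisfy the
guidance equations

$$ \dot\phi = \frac{\delta S}{\delta\phi^*}, \qquad \dot{\mathbf{A}} = \frac{\delta S}{\delta\mathbf{A}}, \tag{18.39} $$

where $\Psi = |\Psi|e^{iS}$. These equations are invariant under the time-independent gauge
transformations Eq. (18.36) because of the constraint Eq. (18.38).

In the framework of standard quantum theory, there is a natural semi-classical approximation
that treats the vector potential classically and the scalar field quantum mechanically.
The scalar field is described by a wave functional $\chi(\phi, t)$ which satisfies

$$ i\partial_t \chi = \int d^3x \left( -\frac{\delta^2}{\delta\phi^*\delta\phi} + |\mathbf{D}\phi|^2 + m^2|\phi|^2 \right)\chi \tag{18.40} $$

and the electromagnetic field satisfies Maxwell’s equations (with $A_0 = 0$)

$$ \partial_\mu F^{\mu\nu} = \langle\chi|\hat{j}^\nu|\chi\rangle, \tag{18.41} $$

where

$$ \langle\chi|\hat{j}^0|\chi\rangle = \int \mathcal{D}\phi\, \chi^*\hat{C}\chi = ie\int \mathcal{D}\phi\, |\chi|^2\left( \phi^*\frac{\delta S}{\delta\phi^*} - \phi\frac{\delta S}{\delta\phi} \right), $$

$$ \langle\chi|\hat{j}^i|\chi\rangle = ie\int \mathcal{D}\phi\, |\chi|^2\left( \phi D_i^*\phi^* - \phi^* D_i\phi \right), \tag{18.42} $$

with

$$ \hat{C}(x) = e\left( \phi^*(x)\frac{\delta}{\delta\phi^*(x)} - \phi(x)\frac{\delta}{\delta\phi(x)} \right) \tag{18.43} $$

the charge density operator in the functional Schrödinger picture. This theory is consistent
since $\partial_\mu\langle\chi|\hat{j}^\mu|\chi\rangle = 0$, as a consequence of the Schrödinger equation (18.40).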

A natural guess for a Bohmian semi-classical approximation similar to the usual one
is the following (and can be obtained from the full Bohmian approach by considering the
conditional wave function $\chi(\phi, t) = \Psi(\phi, \mathbf{A}(t), t)$). An actual field $\phi$ is introduced that
satisfies $\dot\phi = \delta S/\delta\phi^*$, where the wave functional satisfies Eq. (18.40), and Maxwell’s
equations read $\partial_\mu F^{\mu\nu} = j^\nu$, where $j^\nu$ is the classical expression for the charge current.
However, the second order equation for the Bohmian field is

$$ \ddot\phi - \mathbf{D}^2\phi + m^2\phi = -\frac{\delta Q^\chi}{\delta\phi^*}, \tag{18.44} $$

where $Q^\chi = -\frac{1}{|\chi|}\int d^3x\, \frac{\delta^2|\chi|}{\delta\phi^*\delta\phi}$. As a consequence, $\partial_\mu j^\mu = -i\hat{C}Q^\chi$ and hence Maxwell’s
equations imply that $\hat{C}Q^\chi = 0$ or $Q^\chi = Q^\chi(|\phi|^2)$. This is a constraint on the wave functional
that was absent in the usual semi-classical theory. It also seems to be a rather strong
condition. It will, for example, be satisfied if the scalar field evolves classically (i.e. when
the right-hand side of Eq. (18.44) is zero) but it is unclear whether there are other solutions.

So the conclusion seems to be that if we assume $\mathbf{A}$ classical, then $\phi$ should also behave
classically. This is not surprising since the gauge symmetry implies that the physical (i.e.
gauge invariant) degrees of freedom are some combination of the fields $\mathbf{A}$ and $\phi$. So one
cannot just assume $\mathbf{A}$ classical and keep $\phi$ fully quantum.
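The statement that the current fails to be conserved by an amount $-i\hat{C}Q^\chi$ follows in two lines from the definition (18.34) of the current and the second order equation (18.44). The sketch below uses $D_\mu D^\mu\phi = \ddot\phi - \mathbf{D}^2\phi$, which holds in the temporal gauge, and the fact that $Q^\chi$ is real.

```latex
\begin{align*}
\partial_\mu j^\mu
  &= ie\left[ \phi^*\, D_\mu D^\mu\phi - \phi\, (D_\mu D^\mu\phi)^* \right] \\
  &= ie\left[ \phi^*\left( -m^2\phi - \frac{\delta Q^\chi}{\delta\phi^*} \right)
     - \phi\left( -m^2\phi^* - \frac{\delta Q^\chi}{\delta\phi} \right) \right] \\
  &= -ie\left( \phi^*\frac{\delta Q^\chi}{\delta\phi^*}
     - \phi\frac{\delta Q^\chi}{\delta\phi} \right)
   = -i\hat{C}Q^\chi .
\end{align*}
```

Since $\partial_\nu\partial_\mu F^{\mu\nu} = 0$ identically, Maxwell's equations with this current as a source can only hold if the right-hand side vanishes, which is the constraint discussed in the text.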

In Struyve [12], we showed that the problem disappears if we eliminate the gauge 
freedom, for example, by using a gauge which completely fixes the gauge freedom. We 
discussed in detail the Coulomb and the unitary gauge and showed that a semi-classical 
approximation can easily be obtained by considering either the scalar or electromagnetic 
field classically. 

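Let us consider the Coulomb gauge $\nabla\cdot\mathbf{A} = 0$ here. In the full Bohmian approach
[12, 15] we have that there are actual fields$^7$ $\phi$ and $\mathbf{A}^T$ that are guided by a wave functional
$\Psi(\phi, \mathbf{A}^T, t)$ which satisfies the functional Schrödinger equation

$$ i\partial_t\Psi = \int d^3x \left( -\frac{\delta^2}{\delta\phi^*\delta\phi} + |(\nabla - ie\mathbf{A}^T)\phi|^2 + m^2|\phi|^2 - \frac{1}{2}\hat{C}\frac{1}{\nabla^2}\hat{C} - \frac{1}{2}\frac{\delta^2}{\delta\mathbf{A}^{T2}} + \frac{1}{2}(\nabla\times\mathbf{A}^T)^2 \right)\Psi. \tag{18.45} $$

The first three terms in the Hamiltonian correspond to the Hamiltonian of a scalar field
minimally coupled to a transverse vector potential. The fourth term corresponds to the
Coulomb potential and the remaining terms to the Hamiltonian of a free electromagnetic
field. The guidance equations are

$$ \dot\phi = \frac{\delta S}{\delta\phi^*} - e\phi\frac{1}{\nabla^2}\hat{C}S, \qquad \dot{\mathbf{A}}^T = \frac{\delta S}{\delta\mathbf{A}^T}. \tag{18.46} $$

Defining

$$ A_0 = -\frac{i}{\nabla^2}\hat{C}S, \tag{18.47} $$

we can rewrite the guidance equation for the scalar field as

$$ D_0\phi = \frac{\delta S}{\delta\phi^*}. \tag{18.48} $$

The definition of $A_0$ was motivated by analogy with the classical equations of motion [12].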

While this Bohmian approach is equivalent to the one in the temporal gauge [12], it naturally
leads to the following semi-classical approximation (by considering the conditional
wave function for the scalar field). The wave functional $\chi(\phi, t)$ satisfies

$$ i\partial_t\chi = \int d^3x \left( -\frac{\delta^2}{\delta\phi^*\delta\phi} + |(\nabla - ie\mathbf{A}^T)\phi|^2 + m^2|\phi|^2 - \frac{1}{2}\hat{C}\frac{1}{\nabla^2}\hat{C} \right)\chi \tag{18.49} $$

and guides the actual scalar field through

$$ D_0\phi = \frac{\delta S}{\delta\phi^*}, \tag{18.50} $$

where $A_0$ is defined as before and with $S$ now the phase of $\chi$. The vector potential $A^\mu =
(A_0, \mathbf{A}^T)$ satisfies Maxwell’s equations

$$ \partial_\mu F^{\mu\nu} = j^\nu + j_Q^\nu, \tag{18.51} $$

where $j_Q^\nu = (0, \mathbf{j}_Q)$ is an additional “quantum” current, with

$$ \mathbf{j}_Q = i\nabla\frac{1}{\nabla^2}\hat{C}Q^\chi \tag{18.52} $$

and

$$ Q^\chi = -\frac{1}{|\chi|}\int d^3x \left( \frac{\delta^2}{\delta\phi^*\delta\phi} + \frac{1}{2}\hat{C}\frac{1}{\nabla^2}\hat{C} \right)|\chi| \tag{18.53} $$

the quantum potential. These equations are consistent in the sense that $\partial_\mu(j^\mu + j_Q^\mu) = 0$, as
a consequence of the second order equation

$$ D_\mu D^\mu\phi + m^2\phi = -\frac{\delta Q^\chi}{\delta\phi^*}, \tag{18.54} $$

which follows from taking the time derivative of Eq. (18.50).

7 We have used the decomposition $\mathbf{A} = \mathbf{A}^T + \mathbf{A}^L$, where $\mathbf{A}^T$ and $\mathbf{A}^L$ are, respectively, the transverse and
longitudinal part of the vector potential. The Coulomb gauge then corresponds to $\mathbf{A}^L = 0$.
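The consistency claim $\partial_\mu(j^\mu + j_Q^\mu) = 0$ can be verified in a few lines. The sketch assumes that $1/\nabla^2$ denotes the inverse Laplacian, so that $\nabla\cdot\nabla\,(1/\nabla^2) = 1$, and that the quantum potential $Q^\chi$ is real.

```latex
\begin{align*}
\partial_\mu j^\mu
  &= ie\left[ \phi^*\, D_\mu D^\mu\phi - \phi\,(D_\mu D^\mu\phi)^* \right]
   = -ie\left( \phi^*\frac{\delta Q^\chi}{\delta\phi^*}
     - \phi\frac{\delta Q^\chi}{\delta\phi} \right)
   = -i\hat{C}Q^\chi, \\
\partial_\mu j_Q^\mu
  &= \nabla\cdot\mathbf{j}_Q
   = i\,\nabla\cdot\nabla\frac{1}{\nabla^2}\hat{C}Q^\chi
   = i\hat{C}Q^\chi, \\
\partial_\mu\left( j^\mu + j_Q^\mu \right) &= 0 .
\end{align*}
```

The quantum current is thus constructed to cancel exactly the non-conservation of the classical current that obstructed the temporal-gauge attempt.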


18.5 Quantum Gravity: Mini-Superspace Model 


The structure of the Bohmian approach to canonical gravity (outlined in Section 18.2.3) is 
similar to that of scalar electrodynamics in the temporal gauge. Namely, in both cases there 
is a constraint on the wave functional which expresses invariance under infinitesimal gauge 
transformations: spatial diffeomorphisms in the former case and phase transformations in 
the latter case. In both cases the gauge invariance seems to be the source of the problem 
in formulating a consistent Bohmian semi-classical approximation. In the case of quan- 
tum electrodynamics the problem was overcome by gauge fixing. Presumably one can find 
a similar solution in quantum gravity. However, finding a suitable gauge is a notoriously 
hard problem in this case. We can, however, consider a mini-superspace model which is 
a symmetry-reduced approach to quantum gravity where homogeneity and isotropy are 
assumed. In this case, the spatial diffeomorphism invariance is eliminated and we can 
straightforwardly develop a Bohmian semi-classical approximation, as we will now show. 
In the classical mini-superspace model, the universe is described by the FLRW metric

$$ ds^2 = N(t)^2 dt^2 - a(t)^2 d\Omega_3^2, \tag{18.55} $$

where $N$ is the lapse function, $a = e^\alpha$ is the scale factor$^8$ and $d\Omega_3^2$ is the metric on 3-space
with constant curvature $k$. Assuming matter that is described by a homogeneous scalar field
$\phi$, the equations of motion are [12, 34, 35]:

$$ \frac{1}{2}\dot\alpha^2 = \frac{1}{\kappa^2}\left( \frac{1}{2}\dot\phi^2 + V_M \right) + V_G, \tag{18.56} $$

$$ \ddot\phi + 3\dot\alpha\dot\phi + \partial_\phi V_M = 0, \tag{18.57} $$

where the gauge $N = 1$ is chosen,$^9$ $\kappa = \sqrt{3/4\pi G}$,

$$ V_G = -\frac{1}{2}k e^{-2\alpha} + \frac{1}{6}\Lambda \tag{18.58} $$

is the gravitational potential, with $\Lambda$ the cosmological constant, and $V_M$ is the potential for
the matter field.

8 The reason for introducing the variable $\alpha$ is that it is unbounded, unlike the scale factor, which satisfies $a > 0$.
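Equation (18.56) is the familiar Friedmann equation in different variables. As a cross-check, the sketch below starts from the standard form (units with $c = 1$, which is an assumption about the conventions used here) and substitutes $a = e^\alpha$, so that the Hubble rate is $\dot a/a = \dot\alpha$, with $\rho = \tfrac12\dot\phi^2 + V_M$ the energy density of the homogeneous scalar field.

```latex
\begin{align*}
\left( \frac{\dot a}{a} \right)^2
  &= \frac{8\pi G}{3}\,\rho - \frac{k}{a^2} + \frac{\Lambda}{3}, \\
\frac{1}{2}\dot\alpha^2
  &= \frac{4\pi G}{3}\left( \frac{1}{2}\dot\phi^2 + V_M \right)
     - \frac{1}{2}k e^{-2\alpha} + \frac{1}{6}\Lambda
   = \frac{1}{\kappa^2}\left( \frac{1}{2}\dot\phi^2 + V_M \right) + V_G .
\end{align*}
```

Comparing coefficients gives $1/\kappa^2 = 4\pi G/3$, and the curvature and cosmological constant terms combine into $V_G$ of Eq. (18.58).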
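Canonical quantization of the classical theory yields the Wheeler-DeWitt equation:

$$ (\hat{H}_G + \hat{H}_M)\psi = 0, \tag{18.59} $$

where

$$ \hat{H}_G = \frac{1}{2\kappa^2 e^{3\alpha}}\partial_\alpha^2 + \kappa^2 e^{3\alpha}V_G, \qquad \hat{H}_M = -\frac{1}{2e^{3\alpha}}\partial_\phi^2 + e^{3\alpha}V_M. \tag{18.60} $$

In the corresponding Bohmian approach [35], there is an actual FLRW metric of the
form Eq. (18.55) and scalar field, whose time evolutions are determined by the guidance
equations

$$ \dot\alpha = -\frac{N}{\kappa^2 e^{3\alpha}}\partial_\alpha S, \qquad \dot\phi = \frac{N}{e^{3\alpha}}\partial_\phi S, \tag{18.61} $$

where $N$ is an arbitrary lapse function.$^{10}$ In the gauge $N = 1$, these equations imply

$$ \frac{1}{2}\dot\alpha^2 = \frac{1}{\kappa^2}\left( \frac{1}{2}\dot\phi^2 + V_M + Q_M^\psi \right) + V_G + Q_G^\psi, \tag{18.62} $$

$$ \ddot\phi + 3\dot\alpha\dot\phi + \partial_\phi\left( V_M + Q_M^\psi + \kappa^2 Q_G^\psi \right) = 0, \tag{18.63} $$

where

$$ Q_M^\psi = -\frac{1}{2e^{6\alpha}}\frac{\partial_\phi^2|\psi|}{|\psi|}, \qquad Q_G^\psi = \frac{1}{2\kappa^4 e^{6\alpha}}\frac{\partial_\alpha^2|\psi|}{|\psi|}. \tag{18.64} $$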
We will now look for a semi-classical approximation where the scale factor behaves
approximately classically. In order to do so, we assume again the gauge $N = 1$ and we
consider the conditional wave function $\chi(\phi, t) = \psi(\phi, \alpha(t))$, given a set of trajectories
$(\alpha(t), \phi(t))$. Using

$$ \partial_t\chi(\phi, t) = \partial_\alpha\psi(\phi, \alpha)\big|_{\alpha=\alpha(t)}\,\dot\alpha(t), \tag{18.65} $$

we can write

$$ i\partial_t\chi = \hat{H}_M\chi + I, \tag{18.66} $$

where$^{11}$

$$ I = \frac{i}{\dot\alpha}\,\partial_t\chi\left( \dot\alpha + \frac{1}{\kappa^2 e^{3\alpha}}\partial_\alpha S\Big|_{\alpha=\alpha(t)} \right) + \frac{1}{2\kappa^2 e^{3\alpha}}\left[ (\partial_\alpha S)^2 + i\partial_\alpha^2 S \right]\Big|_{\alpha=\alpha(t)}\chi + \kappa^2 e^{3\alpha}\left( V_G + Q_G^\psi \right)\Big|_{\alpha=\alpha(t)}\chi. \tag{18.67} $$

When $I$ is negligible (up to a real time-dependent function times $\chi$), Eq. (18.66) becomes
the Schrödinger equation for a homogeneous matter field in an external FLRW metric. We
can further assume the quantum potential $Q_G^\psi$ to be negligible compared to other terms in
Eq. (18.62). As such, we are led to the semi-classical theory:

$$ i\partial_t\chi = \hat{H}_M\chi, \tag{18.68} $$

$$ \dot\phi = \frac{1}{e^{3\alpha}}\partial_\phi S, \tag{18.69} $$

$$ \frac{1}{2}\dot\alpha^2 = \frac{1}{\kappa^2}\left( \frac{1}{2}\dot\phi^2 + V_M + Q_M^\chi \right) + V_G = -\frac{1}{\kappa^2 e^{3\alpha}}\partial_t S + V_G. \tag{18.70} $$

9 The theory is time-reparameterization invariant. Solutions that differ only by a time-reparameterization
are considered physically equivalent. Choosing the gauge $N = 1$ corresponds to a particular
time-parameterization.

10 Just like the classical theory, the Bohmian approach is time-reparameterization invariant. This is a special feature
of mini-superspace models [36, 37]. As mentioned before, for the usual formulation of the Bohmian dynamics
for the Wheeler-DeWitt theory of quantum gravity, a particular space-like foliation of space-time or, equivalently,
a particular choice of “initial” space-like hypersurface and lapse function, needs to be introduced.
Different foliations (or lapse functions) yield different Bohmian theories.
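The second equality in Eq. (18.70) is the Hamilton-Jacobi form of the matter Schrödinger equation (18.68). A short sketch of the step, writing $\chi = |\chi|e^{iS}$, taking $\hat{H}_M$ from Eq. (18.60), and defining $Q_M^\chi$ as in Eq. (18.64) with $\psi$ replaced by $\chi$:

```latex
\begin{align*}
\mathrm{Re}\,\frac{i\partial_t\chi}{\chi} &= -\partial_t S, \qquad
\mathrm{Re}\,\frac{\hat{H}_M\chi}{\chi}
   = \frac{1}{2e^{3\alpha}}(\partial_\phi S)^2 + e^{3\alpha}V_M
     - \frac{1}{2e^{3\alpha}}\frac{\partial_\phi^2|\chi|}{|\chi|}, \\
-\partial_t S
  &= e^{3\alpha}\left( \frac{1}{2}\dot\phi^2 + V_M + Q_M^\chi \right)
  \quad\text{using}\quad \partial_\phi S = e^{3\alpha}\dot\phi, \quad
  Q_M^\chi = -\frac{1}{2e^{6\alpha}}\frac{\partial_\phi^2|\chi|}{|\chi|}, \\
-\frac{1}{\kappa^2 e^{3\alpha}}\partial_t S
  &= \frac{1}{\kappa^2}\left( \frac{1}{2}\dot\phi^2 + V_M + Q_M^\chi \right).
\end{align*}
```

Adding $V_G$ to both sides of the last line reproduces the two expressions in Eq. (18.70).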


Let us now consider when the term $I$ will be negligible. The quantity in brackets in
the first term would be zero when evaluated for the actual trajectory $\phi(t)$ (because of the
guidance equation for $\alpha$). As such, the first term will be negligible if the actual scale factor
evolves approximately independently of the scalar field. The second term will be negligible
if $S$ varies slowly with respect to $\alpha$ or if the term in square brackets is approximately
independent of $\phi$. In the latter case, the second term becomes a time-dependent function
times $\chi$, which can be eliminated by changing the phase of $\chi$. Similarly, if $Q_G^\psi \ll V_G$ then
the third term also becomes a time-dependent function times $\chi$.

In the usual semi-classical approximation, one has Eq. (18.68) and

$$ \frac{1}{2}\dot\alpha^2 = \frac{1}{\kappa^2 e^{3\alpha}}\langle\chi|\hat{H}_M|\chi\rangle + V_G, \tag{18.71} $$

with $\chi$ normalized to one. These equations follow from Eqs. (18.1) and (18.2). In Struyve
[12] an example is worked out for which the Bohmian semi-classical approximation gives
better results than this approximation. (Note that Vink himself, in his seminal paper on
applying the Bohmian approach to quantum gravity, considers a derivation of the usual
semi-classical approximation, rather than the Bohmian one. But he hinted at the Bohmian
semi-classical approximation in Kowalski-Glikman and Vink [38].)


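11 To obtain this equation, note that $\partial_\alpha^2\psi = [(\partial_\alpha S)^2 + i\partial_\alpha^2 S + \partial_\alpha^2|\psi|/|\psi|]\psi + 2i\partial_\alpha S\,\partial_\alpha\psi$, so that
$\partial_\alpha^2\psi|_{\alpha=\alpha(t)} = [(\partial_\alpha S)^2 + i\partial_\alpha^2 S + \partial_\alpha^2|\psi|/|\psi|]|_{\alpha=\alpha(t)}\chi + 2i\partial_\alpha S|_{\alpha=\alpha(t)}\,\partial_t\chi/\dot\alpha$. Using this equation
together with Eq. (18.59) we obtain Eq. (18.66).


18.6 Conclusion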


We have shown how semi-classical approximations can be developed using Bohmian 
mechanics. We have obtained these approximations from the full Bohmian theory by 
assuming certain degrees of freedom to evolve approximately classically. This was illus- 
trated for non-relativistic systems. If there is a gauge symmetry, like in electrodynamics or 
gravity, then extra care is required in order to obtain a consistent semi-classical theory. By 
eliminating the gauge symmetry, either by imposing a gauge or by working with gauge- 
independent degrees of freedom, we were able to find a semi-classical approximation in 
the case of scalar quantum electrodynamics. For quantum gravity, eliminating the gauge 
symmetry (more precisely the spatial diffeomorphism invariance) is notoriously hard. We 
have only considered the simplified mini-superspace approach to quantum gravity, which 
describes an isotropic and homogeneous universe, and where the diffeomorphism invari- 
ance is explicitly eliminated. More general cases in quantum gravity still need to be studied. 
For example, for the case of inflation theory, where one usually considers fluctuations 
around an isotropic and homogeneous universe, it should not be too difficult to develop a 
Bohmian semi-classical approximation. 

Apart from possible applications in quantum cosmology, such as inflation theory, it 
might also be interesting to consider potential applications in quantum electrodynamics 
or quantum optics. In particular, since the results may be compared to the predictions of 
full quantum theory, this may give us a handle on where to expect better results for the 
Bohmian semi-classical approximation compared to the usual one in the case of quan- 
tum gravity where the full quantum theory is not known. That is, it might give us better 
insight into which effects are truly quantum and which effects are merely artifacts of the 
approximation. 

Further developments may include higher order corrections to the semi-classical approximation.
One way of doing this might be by following the ideas presented in Norsen [39]
and Norsen et al. [40]. As explained there, one might introduce extra wave functions for a
subsystem in addition to the conditional wave function. These wave functions interact with 
each other and the Bohmian configurations. By including more of those wave functions 
one presumably obtains better approximations to the full quantum result. 

Finally, although we regard the Bohmian semi-classical approximation for quantum 
gravity as an approximation to some deeper quantum theory for gravity, one could also 
entertain the possibility that it is a fundamental theory on its own. At least, there is 
presumably as yet no experimental evidence against it.


Part V
Methodological and Philosophical Issues 


19 
Limits of Time in Cosmology 


SVEND E. RUGH AND HENRIK ZINKERNAGEL 


19.1 Introduction 


What does time mean in cosmology? Are there any physical conditions which must be 
satisfied in order to speak about cosmic time? If so, how far back can time be extrapolated 
while still maintaining it as a well-defined physical concept? We have studied these ques- 
tions in a series of papers over the last ten years. The present chapter is a summary of some 
main points from our investigations, as well as some further considerations regarding time 
in cosmology. 

It is standard to assume that a number of important events took place in the first tiny
fractions of a second “after” the big bang. For instance, the universe is thought to have
been in a quark-gluon phase between $\sim 10^{-11}$ and $10^{-5}$ seconds, whereas the fundamental
material constituents are massless due to the electroweak (Higgs) transition at times earlier
than $\sim 10^{-11}$ seconds. A phase of inflation is envisaged (in some models) to have taken
place around $\sim 10^{-34}$ seconds after the big bang. A rough summary of the phases of the
early universe is given in Figure 19.1.$^1$

What could be wrong (or at least problematic) with this backward extrapolation from
now? A main point is that physical time in relativity theory, in contrast to a purely mathematical
parameter with the label $t$, is bound up with the notion of proper time.$^2$ For
example, Misner, Thorne and Wheeler write:


... proper time is the most physically significant, most physical real time we know. It corresponds to 
the ticking of physical clocks and measures the natural rhythms of actual events. 


Misner, Thorne and Wheeler (1973, p. 813) 


1 For a detailed discussion of (the assumptions behind) this figure and the epochs indicated, see also Rugh and
Zinkernagel (2009).

2 Proper time along a (timelike or lightlike) world line (the path of a particle in four-dimensional spacetime) can
be thought of as the time measured by a “standard” clock along that world line.

Figure 19.1 Contemplated phases of the early universe. The indicated quantum and scale problems
for time are discussed in the text.

The connection between physical time and proper time leads to two kinds of problems
for the backward extrapolation. The first of these follows from the fact that proper time is
closely related to physical clocks or processes. The nature (and availability) of such clocks
or processes changes as we go back in time. The problem in this regard was hinted at
by Misner, Thorne and Wheeler in connection with a discussion of whether a singularity
occurs at a finite past proper time. They note that no actual clock can adequately time the
earliest moments of the universe:


Each actual clock has its “ticks” discounted by a suitable factor - $3 \times 10^7$ seconds per orbit from the
Earth-sun system, $1.1 \times 10^{-10}$ seconds per oscillation for the Cesium transition, etc. Since no single
clock (because of its finite size and strength) is conceivable all the way back to the singularity, a
statement about the proper time since the singularity involves the concept of an infinite sequence of
successively smaller and sturdier clocks with their ticks then discounted and added. [...] ... finiteness
[of the age of the universe] would be judged by counting the number of discrete ticks on realizable
clocks, not by assessing the weight of unrealizable mathematical abstractions. [Our emphasis]


Misner, Thorne and Wheeler (1973, p. 814) 
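The two “discount factors” in the quote can be checked with simple arithmetic. The sketch below assumes the SI value of the caesium-133 hyperfine frequency and a Julian year of 365.25 days; the variable names are illustrative only.

```python
# Duration of one "tick" for the two clocks mentioned in the quote.
CESIUM_HZ = 9_192_631_770                 # SI definition of the second (Cs-133 hyperfine transition)
SECONDS_PER_YEAR = 365.25 * 24 * 3600     # one orbit of the Earth around the Sun (Julian year)

cesium_tick = 1.0 / CESIUM_HZ             # seconds per oscillation
orbit_tick = SECONDS_PER_YEAR             # seconds per orbit

print(f"{cesium_tick:.2e} s per oscillation, {orbit_tick:.1e} s per orbit")
```

Both values agree with the rounded figures quoted by Misner, Thorne and Wheeler, roughly $1.1 \times 10^{-10}$ and $3 \times 10^7$ seconds.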


The authors’ discussion regarding this quote seems to imply that the progressively more 
extreme physical conditions, as we extrapolate the standard cosmological model back- 
wards, demand a succession of gradually more fine-grained clocks to give meaning to (or 
provide a physical basis of) the time of each of the epochs.$^3$ In this spirit, our view is that
a minimal requirement for having a physical notion of time (with a scale) is that it must 
be possible to find physical processes (what we call “cores of clocks”) with a sufficiently 
fine-grained duration in the physics envisaged in the various epochs of cosmic history. As 
we shall discuss below, this requirement of linking time to conceivable cores of clocks 
leads to a scale problem for time, since it becomes progressively more difficult to identify 
physical processes with a well-defined (and non-zero) duration in the very early universe. 

A second kind of problem with the backward extrapolation follows since proper time
is defined in terms of (possible) particle world lines or trajectories. Within the standard
cosmological model, there is a privileged set of such world lines since matter on large
scales is assumed to move in a highly ordered manner (allowing for the identification of
a comoving reference frame and a global cosmic time equal to the proper time of any
comoving observer). As we shall discuss, this implies that the notion of cosmic time is
closely related to the so-called Weyl principle. Problems with the notion of a global cosmic
time may arise if a privileged set of world lines becomes difficult to identify, e.g. in the
very early universe above the electroweak (Higgs) phase transition or in a (complicated)
inhomogeneous universe.

3 Regarding Misner, Thorne and Wheeler’s examples in the quote, it is clear that one has to distinguish between
how fine-grained a clock is (its precision) and when (in which cosmological epoch) such a clock could in
principle be realized. For instance, no stable Cesium atoms (let alone real functioning Cesium clocks) can
exist before the time of decoupling of radiation and matter, about 380,000 years after the big bang.

A more serious problem for time (which is a problem even for a local definition of time) 
arises if a point is reached in the backward extrapolation where the world lines themselves 
can no longer be identified. In particular, this appears to be the case if some point is contem- 
plated, e.g. at the onset of inflation, where all constituents of the universe are of a quantum 
nature, leading to what can be called the quantum problem of time. Note that this problem
arises roughly ten orders of magnitude “before” (in the backward extrapolation from now) 
reaching a possible quantum gravity epoch, and so before hitting the usual problem of time 
in quantum gravity models. 

In the following, we first outline the scale problem for time and the close relation 
between time and clocks. We then address the relation between time and world lines in 
the set-up of the standard cosmological model. We briefly indicate how this relation may 
lead to problems for a global (cosmic) time concept in the very early universe above the 
electroweak phase transition or in a (complicated) inhomogeneous universe. We finally 
discuss the more serious local (quantum) problem for time in relation to the problem of 
identifying individual world lines. 


19.2 Time and Clocks 


The idea that time is dependent on change and/or motion is called relationism. It has been 
defended by classic thinkers like Aristotle and Leibniz, and in modern times by physicists 
like Barbour, Smolin and Rovelli. In our version of relationism, we argue in favour of
a “time-clock” relation which asserts that time, in order to have a physical basis, must
be understood in relation to physical processes which act as “cores” of clocks (Rugh
and Zinkernagel, 2005, 2009; see also Zinkernagel, 2008). In the cosmological context,
the time-clock relation implies that a necessary physical condition for interpreting the $t$
parameter of the standard Friedmann-Lemaître-Robertson-Walker (FLRW) model as cosmic
time in some “epoch” of the universe is the (at least possible) existence of a physical
process which can function as a core of a clock in the “epoch” in question. In particular,
we have suggested that in order to make the interpretation


$t \leftrightarrow$ time,


at a specific cosmological “epoch”, the physical process acting as the core of a clock 
should: 1) have a well-defined duration which is sufficiently fine-grained to “time” the 
epoch in question; and 2) be a process which could conceivably take place among the 
material constituents available in the universe at this epoch.

The time-clock relation is in conformity with how time is employed in cosmology,
although cosmologists often express themselves in operationalist terms, that is, by invoking
observers who take measurements on actual clocks. For instance, Peacock writes concerning the
FLRW model:


We can define a global time coordinate t, which is the time measured by clocks of these observers — 
i.e. t is the proper time measured by an observer at rest with respect to the local matter distribution.


Peacock (1999, p. 67) 


While this reference to clocks (or “standard” clocks) carried by comoving observers is 
widely made in cosmology textbooks, there is usually no discussion concerning the origin 
and nature of these clocks. Part of the motivation for our investigations has been to provide 
a discussion of this kind. 

The standard definition of the global time coordinate to which Peacock refers (and,
in general, the question of how to make the $t \leftrightarrow$ time identification) can be read in at least
two different ways: 1) Actual clocks should be available (operationalism); or 2) rudiments
(cores) of clocks with a well-defined duration should, in principle, be present (time-clock
relation). Clearly, the first possibility is not an option in the very early universe where no 
actual clocks, let alone observers and measurements, are available. As we shall see below, 
the viability of the second option depends upon the availability of physical processes with 
well-defined (and non-zero) duration. 

We attempt to develop a position on the time concept which represents a departure from 
operationalism in several ways: (i) Time cannot be defined (reductively) in terms of clocks 
(since clocks and measurements depend on the time concept); (ii) no actual clocks are 
needed, we allow reference to possible (counterfactual) clocks, compatible with the physics 
of the epoch in question; (iii) we attempt to construct the cores of clocks out of available 
physics, but do not require that this core should be associated with a counter mechanism 
that could transform it into a real functioning clock; and (iv) we do not require the existence 
of observers and actual measurements. Nevertheless, the above formulated criterion for the
$t \leftrightarrow$ time interpretation of being able to identify a process with a well-defined duration
may still have an operationalist feel. For, as we shall see below, it means that there may 
be limits to time in cases where scales can be found in the physics, but where no physical 
process (core of a clock) can be identified which could in principle exemplify or realize the 
time scale in question. 

Whereas cosmologists often refer to clocks as sketched above, they also define cosmic 
time “implicitly” by the specific cosmological model employed to describe the universe.⁴ 
This can be done e.g. through the relation between time and the scale factor. If we, for 
instance, consider a radiation-dominated epoch, the Einstein field equations may yield (see 
e.g. Rugh and Zinkernagel (2009), Section 4): 


$$ R(t) \propto \sqrt{t} $$
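
The square-root scaling can be recovered in a few lines. What follows is only a sketch, assuming a spatially flat FLRW model (curvature neglected, as is customary at early times) and a radiation energy density that dilutes as $R^{-4}$; the symbols $\rho$ and $G$ are introduced here for illustration and do not appear in the passage above:

```latex
\begin{align}
\left(\frac{\dot R}{R}\right)^{2} &= \frac{8\pi G}{3}\,\rho ,
  \qquad \rho \propto R^{-4} \\
\Rightarrow\quad \dot R &\propto R^{-1}
  \quad\Rightarrow\quad \frac{d}{dt}\,R^{2} = \text{const} \\
\Rightarrow\quad R(t) &\propto \sqrt{t}
  \qquad \text{(taking } R \to 0 \text{ as } t \to 0\text{)}
\end{align}
```

The relation ties the parameter t to the scale factor, but it does not by itself say what physical system realizes either quantity.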


4 This is related to a more general discussion of the implicit definition of time via natural laws, see Rugh and 
Zinkernagel (2009), Section 2.1. 
In some sense, the scale factor here serves as a (core of a) clock. However, for this idea to 
work, one needs to have some bound system or a fixed physical length scale which does not 
expand (or which expands differently than the universe). Otherwise, there is no physical 
content of R(t) and hence no physical content of “expansion”. Eddington, for example, 
emphasized that the expansion of the universe must be defined relative to some 
bound systems by turning things upside down: “The theory of the ‘expanding universe’ 
might also be called the theory of the ‘shrinking atom’” (Eddington, 1933, quoted from 
Whitrow, 1980, p. 293). 

The viewpoint presented here assumes that a physical foundation of time is closely 
related to which physical constituents are available (or at least possible) in the early uni- 
verse. Such an assumption can be circumvented if one subscribes to some sort of Platonism 
(or mathematical foundationalism) according to which a purely mathematical definition of 
time, extracted e.g. from the formalism of general relativity (or taken simply as the t 
parameter in some model), is sufficient. According to such a view, there would seem to be no 
problem in contemplating, say, periods like $10^{-100}$ seconds after the big bang. However, 
it is widely accepted that the standard cosmological model cannot be extrapolated below 
Planck scales and, accordingly, that the t ↔ time interpretation cannot be made for t values 
below $10^{-43}$ seconds. This illustrates that a physical condition (namely that quantum 
gravity effects may be neglected) can imply a limitation for the t ↔ time interpretation. 
But if it is accepted that there is at least one physical condition which must be satisfied 
in order to trust the backward extrapolation of the FLRW model and its time concept, it 
appears reasonable to require that also other physical conditions (which are necessary to 
set up the FLRW model) should be satisfied during this extrapolation.⁵ Hence, we take it 
that Platonism is not a satisfactory position regarding time in cosmology. In our view, time 
has to have some physical basis (i.e. it must be embedded in the available physics) in order 
to be a well-defined physical concept. 


19.2.1 The Scale Problem for Time 


Let us now briefly review how the above considerations may lead to a scale problem for 
time in the early universe. The scale problem for time is related to two contemplated phase 
transitions at ~$10^{-5}$ s and ~$10^{-11}$ s in the early universe, where the notion of length and 
time scales (and their physical underpinning in terms of cores of clocks and rods) becomes 
progressively weaker and disappears at ~$10^{-11}$ s (if we consider well-known physics set 
by the standard model of particle physics). 


~$10^{-5}$ s: No bound systems (the quark-hadron phase transition) 


Physically based time and length scales are not independent notions. Einstein discussed 
an elementary clock system, “the light-clock”, which involves the propagation of a light 


5 We shall return to this requirement in the final section. We also note that — except for the space-time singu- 
larity itself — there are no internal contradictions in the mathematics of the FLRW model (or classical general 
relativity) which suggests that this model should become invalid at some point, e.g. at the Planck scale. 
signal across some physical length scale as when a light signal is being reflected back 
and forth between the ends of a rigid rod. If we ask whether we in principle may build 
up such Einstein light-clocks from the constituents of the early universe we note, as we 
extrapolate backwards in time (and the temperature rises), that it becomes progressively 
more difficult to find any spatially extended physical systems. In the so-called hadron era 
there are still bound (hadron) systems such as pions, neutrons and protons. At a transition 
temperature of T ~ $10^{12}$ K (~$10^{-5}$ s after the big bang) it is, however, believed that there is a 
quark-hadron phase transition, and above this transition point no bound states are left. The 
universe then consists of particles (like quarks, leptons, gluons and photons) which have 
no known spatial extension. If a rudiment (or core) of a rod has to be constructed from a 
bound physical system, we no longer have such rudiments of rods left in the universe, and 
we have to look elsewhere for physics which can set a physical length scale. 

The quarks and leptons still possess the physical property of mass. Thus, one still has 
length scales if the Compton wavelength $\lambda = \lambda_C = h/(mc)$ of these particles can be taken 
to set such a scale. However, a rod with spatial extension equal to the Compton wavelength 
leads to a “pair-production of rods” as a quantum effect (in general, the Compton wave- 
length is the length scale at which “pair production” of particle-antiparticle pairs occurs). 
It is thus difficult to imagine how the Compton wavelength divided by c corresponds to a 
physical process which could function as the core of a clock, e.g. in the above-mentioned 
light-clock. 
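
To give a feel for the magnitudes involved, the sketch below evaluates the Compton wavelength h/(mc) and the corresponding light-crossing time for the charged leptons. It is an illustration only: the function names are hypothetical, the constants are CODATA values, and the rest energies are rounded values that are not taken from the text.

```python
# Compton wavelength and light-crossing time for a few massive leptons.
# Constants are CODATA values in SI units; rest energies are rounded,
# illustrative inputs (assumptions, not taken from the chapter).
H = 6.62607015e-34       # Planck constant, J s
C = 299_792_458.0        # speed of light, m/s
EV = 1.602176634e-19     # one electronvolt in joules


def compton_wavelength(rest_energy_ev: float) -> float:
    """Return lambda_C = h / (m c) in metres for a rest energy m c^2 in eV."""
    mass_kg = rest_energy_ev * EV / C**2
    return H / (mass_kg * C)


def crossing_time(rest_energy_ev: float) -> float:
    """Return lambda_C / c in seconds: the time light needs to cross lambda_C."""
    return compton_wavelength(rest_energy_ev) / C


REST_ENERGIES_EV = {"electron": 0.511e6, "muon": 105.7e6, "tau": 1776.9e6}

if __name__ == "__main__":
    for name, mc2 in REST_ENERGIES_EV.items():
        print(f"{name:9s} lambda_C = {compton_wavelength(mc2):.3e} m, "
              f"lambda_C/c = {crossing_time(mc2):.3e} s")
```

For the electron this gives a length of about 2.4 × 10⁻¹² m and a crossing time of about 8 × 10⁻²¹ s. The arithmetic is trivial; the point made in the text is that no physical process seems able to realize such an interval as the core of a clock.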

Note that these considerations, and hence our first proposed time limit at $10^{-5}$ seconds, 
are based on the somewhat operationalist premise that one should in principle be able to 
identify a core of a clock (physical process) with a well-defined duration. The contemplated 
process is one in which a light signal travels a well-defined distance (namely a Compton 
wavelength of a quark or a lepton), but this process seems physically unrealizable insofar 
as the photon is converted to a particle-antiparticle pair during flight.⁶ 


~$10^{-11}$ s: Scale invariance (the electroweak phase transition) 


According to the standard model of particle physics (which embodies a Higgs sector with a 
(set of) scalar field(s) $\phi$) there is an electroweak phase transition at a transition temperature 
of T ~ 300 GeV ~ $10^{15}$ K when the universe was ~$10^{-11}$ s old. Above this phase transition 
the Higgs field expectation value vanishes, $\langle\phi\rangle = 0$. This translates into zero 
rest masses of all the fundamental quarks and leptons (and massive force mediators) in 
the standard model. Without any masses, the theory exhibits a symmetry known 
as conformal invariance, and it will be impossible to find physical processes (among the 
microphysical constituents) with a well-defined (and non-zero) duration.⁷ 
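
The transition temperatures in this section are quoted sometimes in energy units and sometimes in kelvin. The following minimal sketch converts between the two using the Boltzmann constant; the function names are hypothetical, and $10^{12}$ K is used as the order of magnitude usually quoted for the quark-hadron transition.

```python
# Convert between a temperature in kelvin and the thermal energy k_B * T in eV.
K_B_EV_PER_K = 8.617333262e-5   # Boltzmann constant in eV/K (CODATA)


def kelvin_to_ev(temperature_k: float) -> float:
    """Return k_B * T in electronvolts."""
    return K_B_EV_PER_K * temperature_k


def ev_to_kelvin(energy_ev: float) -> float:
    """Return the temperature T (in kelvin) for which k_B * T equals energy_ev."""
    return energy_ev / K_B_EV_PER_K


if __name__ == "__main__":
    # Electroweak transition: T ~ 300 GeV, as quoted in the text.
    print(f"300 GeV -> {ev_to_kelvin(300e9):.2e} K")
    # Quark-hadron transition: order-of-magnitude temperature of 1e12 K.
    print(f"1e12 K  -> {kelvin_to_ev(1e12) / 1e6:.1f} MeV")
```

The conversion gives a few times $10^{15}$ K for 300 GeV and somewhat below 100 MeV for $10^{12}$ K, consistent with the orders of magnitude used here.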


6 As different sorts of rudiments (cores) of clocks, we may consider the decay processes of unstable, massive 
particles such as the decay of the muons, $\mu^- \to e^- \bar{\nu}_e \nu_\mu$, or the decay of the $Z^0$ particles, $Z^0 \to f\bar{f}$ (which 
can decay into any pair of fermions). But, as discussed in Rugh and Zinkernagel (2009), these processes 
are also difficult to conceive as functioning cores of clocks due to their quantum-mechanical and statistical nature. 

7 See however Rugh and Zinkernagel (2009, Section 5.3) for a brief discussion of some possible rudiments of 
mass which could remain above the Higgs transition (but which are, in our assessment, insufficient to ground 
e.g. a physical time scale). 
Thus, not only can no core of a clock be identified; the relevant physics (the electroweak 
and strong sectors) can set no physical scale for time, for length, or for 
energy. If there is no scale for length and energy then there is no scale for temperature T. 
Metaphorically speaking, we may say that not only the property of mass of the particle 
constituents “melts away” above the electroweak phase transition but also the concept of 
temperature itself “melts” (i.e. T loses its physical foundation above this transition point). 

In our assessment, therefore, the time scale assumed (e.g. in cosmology books) above 
the electroweak phase transition is purely speculative in the sense that it cannot be founded 
upon an extrapolation of well-known physics (due to conformal invariance) above the 
phase transition point. Thus, the time scale will have to be founded on the introduction 
of some new physics (beyond the standard model of particle physics), and is in this sense 
as speculative as the new (speculative) physics on which it is based. 

It is of interest that Roger Penrose has recently attempted to turn the scale problem for 
time into a virtue in the construction of a new kind of cosmological scenario. Penrose cites 
our study⁸ in connection with the following quote: 


...close to the Big Bang, probably down to around $10^{-12}$ seconds after that moment, when 
temperatures exceed about $10^{16}$ K, the relevant physics is believed to become blind to the scale factor 
Ω, and conformal geometry becomes the space-time structure appropriate to the relevant physical 
processes. Thus, all this physical activity would, at that stage, have been insensitive to local scale 
changes. [Emphasis in original] 


Penrose (2010, p. 142) 


Our mentioning this point does not imply an endorsement of Penrose’s proposal of an 
“Extraordinary New View of the Universe” (a conformal cyclic cosmology) in which 
approximate conformal invariance holds in both ends (the beginning and the remote future) 
of the universe. Nevertheless, there seems to be a certain agreement in philosophical out- 
look, also when Penrose mentions (2010, p. 93): “It is important for the physical basis 
of general relativity that extremely precise clocks actually exist in Nature, at a fundamen- 
tal level, since the whole theory depends upon a naturally defined metric g” (our emphasis). 


Why not refer to the Planck scales? 


The combination of the constants $\hbar$ and c from relativistic quantum mechanics, and c 
and G from classical general relativity yields (as a mathematical combination of physical 
constants) the famous Planck scales. As concerns time, the Planck time scale 
$t_P = (\hbar G/c^5)^{1/2} \sim 10^{-43}$ seconds is immensely more fine-grained than time scales set 
by any physical process which we (in our investigations) have attempted to utilize as rudiments 
of clocks at various stages in cosmic history. If the Planck time scale were considered 
sufficient to provide a physical basis for the time scale in the early universe, then there 


8 Our study appeared as a handout in a first print in 2005 (Rugh and Zinkernagel, 2005) and was published in a 
revised version in 2009. 


384 S. E. Rugh and H. Zinkernagel 


would be no scale problem for time anywhere along the extrapolation from now to the 
Planck times. 
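
As a purely numerical illustration of how far the Planck time lies below the electroweak time scale, the sketch below evaluates $(\hbar G/c^5)^{1/2}$ from CODATA constants. The function names are hypothetical, and the value $10^{-11}$ s is simply the order of magnitude used in this section for the electroweak transition.

```python
import math

# CODATA values in SI units.
HBAR = 1.054571817e-34   # reduced Planck constant, J s
G = 6.67430e-11          # Newtonian constant of gravitation, m^3 kg^-1 s^-2
C = 299_792_458.0        # speed of light, m/s

# Order of magnitude used in the text for the electroweak transition.
T_ELECTROWEAK_S = 1e-11


def planck_time() -> float:
    """Return t_P = sqrt(hbar * G / c^5) in seconds."""
    return math.sqrt(HBAR * G / C**5)


def planck_length() -> float:
    """Return l_P = c * t_P in metres."""
    return C * planck_time()


def orders_below_electroweak() -> float:
    """Return log10 of the ratio between the electroweak time and t_P."""
    return math.log10(T_ELECTROWEAK_S / planck_time())


if __name__ == "__main__":
    print(f"Planck time   = {planck_time():.3e} s")
    print(f"Planck length = {planck_length():.3e} m")
    print(f"orders of magnitude below 1e-11 s: {orders_below_electroweak():.1f}")
```

The result, roughly 5.4 × 10⁻⁴⁴ s, sits more than thirty orders of magnitude below the electroweak time scale. This is a statement about a combination of constants, not about an identified physical process.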

However, we see several related reasons to doubt that the Planck time scale does 
indeed provide a sufficient physical basis for the time scale in the early universe. First of 
all, the Planck scales are supposed to be the physically relevant scales of theories of quantum 
gravity, and such theories are still highly speculative. The Planck time scale is therefore 
at least as speculative as any other imagined time scale above the electroweak phase tran- 
sition. Second, it is expected that quantum gravity effects are totally negligible at energy 
scales around the electroweak phase transition point (and negligible well into the “desert” 
above this phase transition). It appears dubious to ground time scales of Higgs-physics 
on quantum gravity effects which are irrelevant at Higgs-physics scales.⁹ Third, even if we 
bypass the second problem, one may well question how physically reasonable the supposed 
physical processes would be for grounding the Planck scale (recall that, in our view, a phys- 
ical basis for a time scale should be related to relevant physical processes). The Planck 
scales may be arrived at by setting the Compton wavelength equal to the Schwarzschild 
radius of a black hole. It is thus a characteristic scale at which there is a pair production 
of black holes as a quantum effect. Consider this in the context of the discussion above on 
time and length scales in connection with the light-clock: At the Planck length scale there 
is a “pair production of rods” (the rod being the spatial extension of the quantum black 
hole), and the corresponding Planck time scale is the time it takes a light pulse to cross this 
length scale. This appears to be more of a mathematical construct than a conceivable phys- 
ical process since the crossing of a light pulse is hardly a well-defined physical process 
within such violent fluctuations in the geometry. For instance, one may ask at which of the 
two pair-produced quantum black holes the light pulse is supposed to end its “crossing”?¹⁰ 
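
The step from "Compton wavelength equal to Schwarzschild radius" to the Planck scales can be written out as follows. This is a sketch only: it assumes the reduced Compton wavelength $\hbar/(mc)$, and numerical factors of order unity change if $h$ is used instead; the symbols $m_P$, $l_P$ and $t_P$ denote the usual Planck mass, length and time.

```latex
\begin{align}
\frac{\hbar}{m c} &= \frac{2 G m}{c^{2}}
  \quad\Rightarrow\quad
  m^{2} = \frac{\hbar c}{2G}
  \quad\Rightarrow\quad
  m = \frac{m_P}{\sqrt{2}}, \qquad m_P \equiv \sqrt{\frac{\hbar c}{G}} \\
\frac{\hbar}{m c} &= \sqrt{2}\,\sqrt{\frac{\hbar G}{c^{3}}} = \sqrt{2}\, l_P ,
  \qquad
  \frac{1}{c}\,\frac{\hbar}{m c} = \sqrt{2}\,\sqrt{\frac{\hbar G}{c^{5}}} = \sqrt{2}\, t_P
\end{align}
```

Up to the factor $\sqrt{2}$, the length at which the two radii coincide is the Planck length, and the light-crossing time of that length is the Planck time.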

In our assessment, then, if one wants to solve (or dissolve) the scale problem for time 
at and above the Higgs transition by referring to speculative processes in quantum gravity, 
such as “quantum pair production of black holes”, then it should at least be admitted that 
the cosmic time scale constructed in this way is highly speculative.¹¹ 


19.3 Time and World Lines 


The above section has focused on the consequences of proper time being related to clocks. 
As we saw, this relation leads to the idea that time is related to physical processes, which 


9 Note that today we define a second as 9,192,631,770 vibrations of radiation caused by well-defined transitions 
in Cs-133 (see e.g. 't Hooft and Vandoren, 2014). This is a physical grounding (even operationally) of a 
time scale (a second) in terms of physical processes taking place at time scales substantially (ten orders of 
magnitude) smaller. It would be of interest if one could speculate on any effect on Higgs-scale physics stemming 
from the quantum gravity scales 30 orders of magnitude below. 

10 Note that this third reason against using the Planck scale as a physical basis for the time scale in the early 
universe is, just like the $10^{-5}$ seconds limit discussed above, based on the somewhat operationalist premise 
that we should be able to point to a process (a core of a clock) which provides a definite time interval. 

11 As we shall discuss later, the quantum problem of time (Section 19.3.3) does not depend on whether we have 
a physically well-founded scale for time. This problem remains (e.g. at the onset of inflation) even if we base 
length and time scales (throughout cosmic history) on speculative Planck scale physics. 
is a version of what is known as relationism. But there is a more direct route to relationism 
in cosmology which is independent of the mentioned time-clock relation (even if in 
conformity with it). This has to do with the fact that proper time is defined in terms of 
(possible) particle world lines. In the following we shall discuss how this implies a close 
relation between time and cosmic matter, both at the global and at the local level. 


19.3.1 Setting up the FLRW Model with a Cosmic Time 


In Rugh and Zinkernagel (2011) we discuss how the set-up of the FLRW model with a 
global time is closely linked to the motion, distribution and properties of cosmic matter. 
We now briefly review some key points of this discussion. 

In relativity theory time depends on the choice of reference frame. For the universe, 
a reference frame cannot be given from the outside, so such a frame has to be “built up 
from within”, that is, in terms of the (material) constituents within the universe. It is often 
assumed that the FLRW model may be derived just from the cosmological principle. This 
principle states that the universe is spatially homogeneous and isotropic (on large scales). 
It is much less well known that another assumption, called Weyl’s principle, is necessary in 
order to arrive at the FLRW model and, in particular, its cosmic time parameter.¹² Whereas 
the cosmological principle imposes constraints on the distribution of the matter content 
of the universe, Weyl’s principle imposes constraints on the motion of the matter con- 
tent. Weyl’s principle (from 1923) asserts that the matter content is so well behaved that a 
reference frame can be built up from it: 


Weyl’s principle (in a general form): The world lines of “fundamental particles” form a 
spacetime-filling family of non-intersecting geodesics (a congruence of geodesic world 
lines). 


The importance of Weyl’s principle is that it provides a reference frame which is phys- 
ically based on an expanding “substratum” of “fundamental particles” (e.g. galaxies or 
clusters of galaxies). In particular, if the (non-crossing) geodesic world lines are required 
to be orthogonal to a series of space-like hypersurfaces, a comoving reference frame is 
defined in which constant spatial coordinates are “carried by” the fundamental particles 
(see e.g. Figure 3.7 in Narlikar, 2002, p. 107). The time coordinate is a cosmic time which 
labels the series of hypersurfaces, and which may be taken as the proper time along any of 
the particle world lines. We note that the congruence of world lines is essential to the stan- 
dard cosmological model since the symmetry constraints of homogeneity and isotropy are 
imposed with respect to such a congruence (see e.g. Ellis, 1999). Thus, Weyl’s principle is 
a precondition for the cosmological principle; the former can be satisfied without the latter 
being satisfied but not vice versa. 


12 In some cosmology textbooks (e.g. by Bondi, Raychaudhuri and Narlikar) the importance of Weyl’s principle 
is emphasized and explicitly referred to. In other textbooks it appears, in our assessment (see Rugh and 
Zinkernagel, 2011), that the Weyl principle is implicitly assumed in the process of setting up the FLRW model. 
In the early universe, problems may arise for the Weyl principle and thus for the possibility 
of identifying a reference frame and a global cosmic time parameter.¹³ At present 
and for most of cosmic history, the comoving frame of reference can be identified as the 
frame in which the cosmic microwave background radiation (CMB) looks isotropic (see 
e.g. Peebles, 1993, p. 152), and cosmic matter is (above the homogeneity scale) assumed to 
be described as dust particles with zero pressure which fulfill Weyl’s principle. But before 
the release of the CMB, the situation is less straightforward. For, as we go backwards in 
time, it may become increasingly difficult to satisfy, or even formulate, the Weyl principle 
as a physical principle, since the nature of the physical constituents is changing from 
galaxies, to relativistic gas particles, and to entirely massless particles moving with velocity 
c.¹⁴ Indeed, above the electroweak phase transition (before $10^{-11}$ seconds “after” the 
big bang), all constituents are massless and move with velocity c in any reference frame. 
There will thus be no constituents which are comoving (at rest). One might attempt to con- 
struct mathematical points (comoving with a reference frame) like a center of mass (or, in 
special relativity, center of energy) out of the massless, ultrarelativistic gas particles, but 
this procedure seems to require that length scales be available in order to e.g. specify how 
far the particles are apart (which is needed as input in the mathematical expression for the 
center of energy). As discussed earlier, the only option for specifying such length scales 
(above the electroweak phase transition) will be to appeal to speculative physics, and the 
prospects of satisfying Weyl’s principle (and of having a cosmic time) will therefore also rely 
on speculations beyond current well-established physics. The problem of building up the 
FLRW model with matter consisting entirely of constituents moving with velocity c may 
also be seen by noting that the set-up of the FLRW model requires matter (the energy- 
momentum tensor) to be in the form of a perfect fluid, as this is the only form compatible 
with the FLRW symmetries, see e.g. Weinberg (1972, p. 414). For this, a source consisting 
of pure radiation is not sufficient since one cannot effectively simulate a perfect fluid by 
“averaging over pure radiation”.¹⁵ 

On top of this, the physical basis of the Weyl postulate (e.g. non-intersecting world lines 
of “fundamental particles”), and even that of proper time, appears questionable if some 
period in cosmic history is reached where the “fundamental particles” are described by 
wave-functions $\psi(x, t)$ referring to (entangled) quantum constituents. What is a “world 
line” or a “particle trajectory” then? (See also the section below on the quantum problem 
for time.) 


13 In Rugh and Zinkernagel (2013) we also argue that there is no approximate fulfillment of a Weyl principle and 
no well-defined global (multiverse) cosmic time concept in the eternal inflationary multiverse model outlined 
e.g. by Linde and Guth. 

14 In the early radiation phase, matter is highly relativistic (moving with random velocities close to c), and the 
Weyl principle is not satisfied for a typical particle but one may still introduce fictitious averaging volumes in 
order to create substitutes for “galaxies which are at rest”; see e.g. Narlikar (2002, p. 131). 

15 Krasinski (1997, pp. 5–9) notes that the energy-momentum tensor in cosmological models may contain many 
different contributions, e.g. a perfect fluid, a null fluid, a scalar field, and an electromagnetic field. He also 
emphasizes that a source of a pure null fluid or a pure electromagnetic field is not compatible with the FLRW 
geometry, and that solutions with such energy-momentum sources have no FLRW limit (Krasinski 1997, p. 
13). 
In the following we shall briefly question what happens to cosmic time if/when we 
cannot assume the validity of the standard FLRW model (next Subsection), before we 
turn to the question of what happens if/when the cosmic constituents become quantum 
(Subsection 19.3.3). 


19.3.2 Cosmic Time in an Inhomogeneous Universe 


The cosmological standard model is highly idealized and it is therefore of interest to inquire 
about cosmic time when the model’s idealizing assumptions are relaxed. In particular, one 
may ask whether we still have a good cosmic time concept in our actual — at least up to 
very large scales — inhomogeneous universe? It is well known that close to massive objects 
time runs differently than in more “void like” segments of spacetime away from any such 
massive objects. The complexity of constructing a privileged time notion in such situations 
has been illustrated e.g. in the following example by Feynman. 

How old is our earth? Since clocks (time) run differently in different gravitational poten- 
tials (time dilation in a gravitational field), time will run at a different rate in the center of 
the earth than on the surface of the earth. Feynman remarks: 


... we might have to be more careful in the future in speaking of the ages of objects such as the earth, 
since the center of the earth should be a day or two younger than the surface! 


Feynman (1995, p. 69) 


In fact, the situation is slightly worse, for integrating up a relative time dilation factor of 
$\Delta t/t = \Delta\Phi/c^2$ over a coarse estimate of the (not precisely defined) lifespan of our earth 
(~$5 \times 10^9$ years) yields some years in time difference.¹⁶ 
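
The order-of-magnitude estimate given in the accompanying footnote can be reproduced numerically. The sketch below assumes, as the footnote does, a homogeneous earth, for which the potential difference between center and surface is GM/(2R); the function names are hypothetical and the mass and radius are standard rounded values not given in the text.

```python
# Age difference between the center and the surface of a homogeneous earth.
G = 6.67430e-11          # Newtonian constant of gravitation, m^3 kg^-1 s^-2
C = 299_792_458.0        # speed of light, m/s
M_EARTH = 5.972e24       # mass of the earth, kg (rounded, assumed input)
R_EARTH = 6.371e6        # mean radius of the earth, m (rounded, assumed input)
EARTH_AGE_YEARS = 5e9    # coarse lifespan of the earth used in the text


def schwarzschild_radius(mass_kg: float) -> float:
    """Return R_Schw = 2 G M / c^2 in metres."""
    return 2.0 * G * mass_kg / C**2


def relative_dilation(mass_kg: float, radius_m: float) -> float:
    """Return delta_Phi / c^2 with delta_Phi = G M / (2 R), i.e. R_Schw / (4 R)."""
    return 0.25 * schwarzschild_radius(mass_kg) / radius_m


def age_difference_years() -> float:
    """Integrate the constant relative dilation over the assumed lifespan."""
    return relative_dilation(M_EARTH, R_EARTH) * EARTH_AGE_YEARS


if __name__ == "__main__":
    print(f"R_Schw            = {schwarzschild_radius(M_EARTH) * 1e3:.2f} mm")
    print(f"relative dilation = {relative_dilation(M_EARTH, R_EARTH):.2e}")
    print(f"age difference    = {age_difference_years():.1f} years")
```

With these inputs the relative dilation comes out at about 3.5 × 10⁻¹⁰ and the accumulated difference at a little under two years, in line with the "some years" stated above rather than Feynman's "a day or two".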

In a universe with an inhomogeneous distribution of the material constituents, the situ- 
ation is less clear than in the Feynman example of a slightly inhomogeneous gravitational 
field throughout our earth. In some mathematically simplified spatially inhomogeneous 
models, it may be possible to maintain a Weyl principle and a notion of a global cosmic 
time (cf. e.g. Krasinski (1997) and references therein). However, if our universe exhibited 
fractal behavior and collisions on all scales it would be difficult to uphold a Weyl principle 
(even in an “average sense” where small scale collisions and inhomogeneities are averaged 
out). We may add that such fractal behavior and collisions on all scales appear to be a 
characteristic of envisaged multiverse inflationary scenarios like chaotic inflation, see e.g. 
discussion and references in Rugh and Zinkernagel (2013). 

Not least due to the observed microwave background isotropy (and the remarkable 
isotropy of X-ray counts, radio source counts, and γ-ray bursts) it is generally expected (yet 


16 In an order of magnitude estimate we may assume that our earth is homogeneous; the potential difference 
between the center and the surface of our earth is then $\Delta\Phi = GM/2R$, which translates into 
a relative time dilation effect $\Delta t/t = \Delta\Phi/c^2 = 1/4 \times (R_{Schw}/R) \sim 1/3 \times 10^{-9}$ (here $R_{Schw} = 2GM/c^2$ is 
the Schwarzschild radius of our earth with mass M and radius R). Integrating this relative time dilation over 
~$5 \times 10^9$ years yields an order of magnitude estimate of ~2 years for the age difference (as measured by 
counterfactual clocks located in the center and at the surface of our earth over the lifespan of our earth). 
debated) among cosmologists that there will be a transition from small-scale fractal behavior 
to large-scale homogeneity.¹⁷ A recent study arguing this case is e.g. Scrimgeour et al. 
(2012).¹⁸ Nevertheless, even if our universe is not fractal at the largest (but only at intermediate) 
distance scales, it is an interesting question how significantly this may change the 
cosmic time concept of the resulting cosmological model. Indeed, inhomogeneous models 
with a fractal matter distribution at intermediate scales will presumably exhibit more com- 
plicated conceptions of cosmic time than in the highly symmetric, idealized FLRW model 
universes. 

One way to address what happens in an inhomogeneous universe is to attempt to con- 
struct a notion of cosmic time associated with an event (here, now) by looking at the proper 
time (e.g. Misner, Thorne and Wheeler (1973), §13.4, §27.4) 


$$ t = \tau(\gamma) = \int_\gamma \sqrt{g_{\mu\nu}\,dx^\mu dx^\nu} , \qquad (19.1) $$


along (particle) timelike world lines (indicated with subscript $\gamma$ and with 4-velocities 
$u^\mu(v) = dx^\mu(v)/dv$), which start at the beginning of spacetime and end in the event 
(here, now).¹⁹ But along which world lines $\gamma$ should the proper time integral be taken? 
Ellis (2012, pp. 9, 10) proposes to take the proper time integral along a specific set of 
preferred fundamental world lines, which (for realistic matter) are uniquely geometrically 
determined. This construction does not invalidate the Weyl principle but rather builds on 
it and develops it (Ellis, private communication).²⁰ The “present” is in this construction 
defined as the surface {t = constant} determined by taking the proper time integral Eq. 
(19.1) over the family of fundamental world lines starting at the “big bang”. However, 
according to Ellis, the equal time hypersurfaces can in generic situations be much more 
complicated (see discussion in Ellis, 2012, p. 10) than the simple equal (cosmic) time 


17 Moreover, it has been emphasized, e.g. by Barrow (2005), that large contrasts in density $\delta\rho/\rho$ are not 
necessarily mirrored in similar inhomogeneities in the gravitational potential $\Phi$ since the equation for the 
relative perturbation $\delta\Phi/\Phi$ of the gravitational potential has in it a huge suppression factor 
($\delta\Phi/\Phi \sim \delta\rho/\rho \times (L/(c/H))^2$) if the size L of the density irregularity is small relative to the Hubble radius c/H. 

18 If we want to observationally test the expected homogeneity at large scales, we should pay attention to the 
danger of vicious circularities (“catch-22”). Distance measures like redshift-distance measures (at large distances) 
should not have built in the assumptions we want to test (the FLRW model as space-time metric, etc.). 
The analysis provided in e.g. Scrimgeour et al. (2012) is very elaborate but it is of interest that they note (p. 
4): “To do this, we assume the FRW metric and ΛCDM. This is necessary for any homogeneity measurement, 
since we must always assume a metric in order to interpret redshifts. Therefore in the strictest sense 
this can only be used as a consistency test of ΛCDM. However, if we find the trend towards homogeneity 
matches the trend predicted by ΛCDM, then this is a strong consistency check for the model and one that an 
inhomogeneous distribution would find difficult to mimic” (our emphasis). 

19 Such a definition is only well-defined, i.e. the proper time is only finite, if there is a beginning of space-time 
e.g. in a “big bang” (see also Lachièze-Rey 2014, Section 5.3), or if we choose some arbitrary starting point 
(assumed to exist) from which we can integrate (Ellis 2012, Section 3). 

20 We note that Ellis assumes that there is a uniquely defined vector field of 4-velocities $u^\mu(v) = dx^\mu(v)/dv$ (if 
such 4-velocities are uniquely defined at each spacetime point on the manifold, this is equivalent to assuming 
the existence of a congruence of world lines which are non-crossing). According to Ellis’ proposal, these 4-velocities 
(in order to be preferred fundamental world lines) should satisfy that they are timelike eigenlines of 
the Ricci tensor, $R_{\mu\nu}u^\nu = \lambda u_\mu$. 
hypersurfaces in the FLRW model universes. In particular, Ellis remarks that the equal 
time hypersurfaces may not even necessarily be spacelike in an inhomogeneous spacetime. 

It therefore appears to be a complicated — and to our knowledge still open — question 
whether the resulting concept of cosmic time exhibits the properties which allow for a 
“backward” extrapolation into an “early” inhomogeneous universe. 


19.3.3 The Quantum Problem for Time 


We have seen above that it may be difficult to identify a global cosmic time (without the 
Weyl principle), and in earlier sections also that there may not be a scale for time (before 
the Higgs transition). Even if this is so, it might still be possible to maintain a local time 
order, i.e. to ask about the past of some particular event — for instance, the past of the 
onset of inflation. However, as we shall indicate below, it may well be that not even a local 
(and scale-free) time order is available as time is extrapolated backwards in the very early 
universe. 

The origin of this local (or quantum) problem for time is due to the widely assumed 
“quantum fundamentalist” view according to which the material constituents of the uni- 
verse could be described exclusively in terms of quantum theory at some early stage of the 
universe. Such a perspective is natural in quantum cosmology (and quantum gravity), in 
which spacetime itself is treated quantum mechanically (see also Hartle 1991). From the 
point of view of such theories, it has been argued that a quantum problem of time appears 
already (in the backward extrapolation from now) at the onset of inflation. Thus, Kiefer 
affirms that: 


The Universe was essentially “quantum” at the onset of inflation. Mainly due to bosonic fields, deco- 
herence set in and led to the emergence of many “quasi-classical branches” which are dynamically 
independent of each other. Strictly speaking, the very concept of time makes sense only after deco- 
herence has occurred. In addition to the horizon problem etc., inflation also solves the “classicality 
problem”. [...] Looking back from our Universe (our semiclassical branch) to the past, one would 
notice that at the time of the onset of inflation our component would interfere with other components 
to form a timeless quantum-gravitational state. The Universe would thus cease to be transparent to 
earlier times (because there was no time). 


Kiefer (2003, p. 208) 


The problem here seems to be that our spacetime (and therefore time) “dissolves” into a 
superposition of spacetimes at the onset of inflation, and in this sense Kiefer acknowledges 
a quantum problem of time at this point. The situation, however, might be worse (i.e. 
the quantum problem may appear earlier in a backward extrapolation from now), since the 
appeal to decoherence is questionable. To see this, consider what one might call the cosmic 
measurement problem, which addresses the quantum mechanical measurement problem in 
a cosmological context: 


The cosmic measurement problem: If the universe, either its content or in its entirety, was 
once (and still is) quantum, how can there be (apparently) classical structures now? 


390 S. E. Rugh and H. Zinkernagel 


While many aspects of the cosmic measurement problem have been addressed in the 
literature, the perspective which we have tried to add is that the problem is closely related 
to providing a physical basis for the (classical) FLRW model with a (classical) cosmic 
time parameter. As illustrated in the Kiefer quote above, an often attempted response to the 
cosmic measurement problem is to proceed via the idea of decoherence. According to this 
idea, some degrees of freedom are regarded as irrelevant (they are deemed inaccessible to 
measurements and are traced out), and they are therefore taken to act as an environment for 
the relevant variables. The picture is that the environment in a sense “observes” the system 
in a continuous measurement process and thus suppresses superpositions of the system (see 
e.g. Kiefer, 1989). 

However, as is widely known, decoherence cannot by itself solve the measurement problem
and explain the emergence of a classical world.21 For, if both environment and system
are quantum, the total state of the system (relevant plus irrelevant degrees of freedom)
is still a superposition. According to quantum mechanics, no definite (classical) state can
therefore be attributed to any of the components. As argued by Sudarsky (2011, Section
4.1), this problem is only aggravated in the cosmological context since one cannot here
appeal to the usual pragmatic considerations regarding what classical observers and their
measurement apparatus would register.22 In spite of such worries, Kiefer (2003) contemplates
that decoherence successively classicalizes different constituents of the universe: At
the onset of inflation, the inflaton field itself is classicalized and, at the end of inflation,
decoherence converts the quantum fluctuations of the inflaton field into classical density
perturbations (seeds of structure).23

But even if one were to bypass the strong arguments against decoherence as a solu- 
tion to the cosmic measurement problem, a potentially more serious problem is lurking: If 
decoherence is to explain the emergence of classical structures, it cannot — as in envi- 
ronmentally induced decoherence — be a process in (cosmic) time, insofar as classical 
structures (particle world lines) are needed from the start to define time both locally and 
globally! There thus seems to be a vicious circularity if one invokes decoherence to explain 
the “emergence” of time, which we can formulate in slogan form: 


Decoherence takes time and cannot therefore provide time. 


This implies that several of the temporal expressions in the quote by Kiefer given above 
(“decoherence sets in”, “after decoherence has occurred”, etc.) are strictly speaking
without meaning. 


21 For a simple explanation of this, and some references to the relevant literature, see e.g. Zinkernagel (2011, 
Section 2.1). 

22 From a pragmatic point of view, quantum mechanics may be seen as a theory of expected outcomes of measurements,
in which both apparatus and observers are kept outside the quantum description. We have pointed
out elsewhere (Rugh and Zinkernagel, 2005; Zinkernagel, 2016) that Bohr went beyond this pragmatic (or
instrumental) interpretation. His view was rather a contextual one according to which any system can be
treated quantum mechanically but not all systems can be treated this way at the same time.

23 In this regard, Anastopoulos (2002) mentions a worry about decoherence closely related to the ones already
noted: “...a sufficiently classical behavior for the environment seems to be necessary if it is to act as a
decohering agent and we can ask what has brought the environment into such a state ad-infinitum”.

Although the discussion above has focused on decoherence, we note that the quantum
problem of time seems to be shared by other “quantum fundamentalist” views even when 
these do not rely essentially on decoherence (e.g. the spontaneous collapse model described 
in Sudarsky, 2011). Our point is that any interpretation of quantum mechanics will need 
a time concept — which is bound up with the notion of possible (classical) particle world 
lines — in order to address the early universe. The assumption of a quantum nature of 
the material (or otherwise) constituents of the universe makes it hard (or impossible) to 
associate these with well-defined particle trajectories. During inflation the only relevant 
constituent of the universe is taken to be the inflaton field φ which, in the last analysis, is
a quantum field. And just as wave functions in non-relativistic quantum theory do not give 
rise to physical motion (of a particle or wave) in space and time — without assumptions 
solving the measurement problem — so quantum fields do not describe moving elementary 
particles in space with well-defined trajectories. 

Up to this point we have discussed the quantum problem of time from a quantum fun- 
damentalist point of view based on quantum cosmology or quantum gravity. Let us now 
proceed from the present (and more cautious) perspective, in which we start from a clas- 
sical point of view and attempt to extrapolate proper time backwards. More specifically, 
consider the past of some event by extrapolating backwards the proper time integral along a 
world line with 4-velocity $u^\mu(\tau) = dx^\mu(\tau)/d\tau$, which ends in the event (formula as in Eq.
(19.1) in Section 19.3.2). This approach assumes that we know the metric and that there 
are well-defined 4-velocities. The question then becomes whether such 4-velocities (or, 
equivalently, world lines) can always be constructed, i.e. physically realized as opposed 
to merely mathematically defined, from the available constituents (e.g. from a scalar 
field φ).24

In the inflationary scenario, the relevant candidate for constructing sensible notions of 
particle world lines and classical trajectories will have to come from the field. And even 
if we were allowed to take this field as effectively classical (described by the lowest order 
approximation in quantum field theory) during inflation (e.g. during a slow-roll evolution), 
the quantum problem of time will be faced at the onset of inflation. At this point (supposed
to be the “birth” of our bubble universe in a multiverse setting), the inflaton field is strongly 
quantum: Quantum fluctuations with amplitudes (within a factor of ten) of the order of the 
Planck scale are necessary to reset or lift the scalar field to a value where a new bubble 
(our universe) is born and becomes dominated by inflation (see e.g. Linde, 2004, Section 
4). Thus, at the beginning of inflation (or the “birth” of our universe), the φ field is nowhere
close to being a classical field on top of which we have small quantum fluctuations. Rather, 
it is entirely dominated by Planck scale quantum fluctuations. 

In summary, to use the local time concept for contemplating times before inflation (or,
indeed, earlier bubble universes in the multiverse), it must be possible to identify (or, at
least, to speculate) a particle world line along which proper time can be extrapolated backwards.25
But, as we have seen, it is unclear how one would go about constructing any
individual classical particle world line from the inflationary scalar field φ in a regime where
its quantum behavior is dominant (at the onset of inflation). If such world lines (classical
trajectories) cannot be constructed from the underlying physics (the φ field), it seems that
the very conditions for speaking about the past of an event in general relativity are not
fulfilled. Hence, in our assessment, a pure quantum phase in the early universe implies that
proper time (and even its order aspect, that is, its ability to distinguish before and after) is
no longer a well-defined concept.


24 One idea here would be to equate the energy momentum tensor of the perfect fluid form with the energy
momentum tensor for the scalar field. This results in the 4-velocity $u_\mu = A\,\partial_\mu\phi$ where $A = (\partial^\nu\phi\,\partial_\nu\phi)^{-1/2}$,
see e.g. Krasinski (1997, p. 8) and Hobson et al. (2006, p. 432).


19.4 Summary and Discussion 


It is common practice to extrapolate the standard cosmological model back to at least the 
Planck time. In this chapter, we have tried to insist that this is problematic. The underlying 
philosophical reason is that the extrapolation of the FLRW model and its time concept 
requires, in our view, that the physical basis of time in the model and, more generally, the 
physical conditions needed to set up the model, are not invalidated along this extrapolation. 
This situation gives rise to a number of possible limits of time, respectively, at ~10~°s, 
$10^{-11}$ s, $10^{-34}$ s, and $10^{-43}$ s “after” the mathematical point t = 0 in the FLRW model.

As briefly hinted in Section 19.2, we are aware that we are here making a philosophical 
choice — at least concerning the two first limits. For we are assuming that the natural laws 
need a physical basis at all points along the extrapolation, as opposed to just having a basis 
at the present epoch (when it is easy to identify not only length and time scales, but also 
physical processes with well-defined durations). The difference between the first two lim- 
its and the Planck time (and possibly the time of onset of inflation) is that the former two 
(phase transitions) do not mark events where the natural laws are expected to break down. 
Rather, the two phase transitions are predictions of the natural laws themselves (by con- 
trast, classical gravity is expected to break down at the Planck scale). Hence, in the case 
of the first two time limits, the problem concerns the interpretation of the natural laws; 
i.e. whether we are entitled to interpret the laws as physical laws throughout the backward 
extrapolation, if the foundation for this interpretation (like the existence of cores of rods 
and clocks) disappears at some point along the extrapolation. Given our view of the inter- 
pretation of natural laws, the time concept in the early universe becomes speculative before 
the electroweak phase transition. As we have seen, before this point (~$10^{-11}$ seconds),
known physics becomes scale invariant and so one loses any (non-speculative) handle on 
how close we are to the singularity. We believe, but it should be further examined, that our 
position is a reasonable compromise between Platonism (mathematical foundationalism) 
and operationalism (which requires a method for actually measuring cosmic time). 


25 From our relationist point of view — in which time is necessarily related to physical processes — the time-like 
curves can only be identified (they only have a physical basis) if the motion of objects or test particles along 
these curves is at least in principle realizable given the available physics. 

In Sections 19.3.1 and 19.3.2 we have seen that a global concept of cosmic time (with
or without a scale) may become problematic in the early universe if the Weyl principle
cannot be satisfied (e.g. if everything moves with the speed of light and no comoving
reference frame can be constructed). Moreover, the discussion in Section 19.3.3 showed
that not even a local concept of time, which could be used to address the past of some local
event, may be available as time is extrapolated backwards in the very early universe. In
particular, this seems to be the case if one assumes “quantum fundamentalism”; the idea
that everything is quantum, and even if something looks classical now, there was an early
time, e.g. $10^{-34}$ seconds, when nothing did. Thus, if all constituents are quantum at the
onset of inflation ($10^{-34}$ s), it seems difficult (or impossible) to even construct a physical
notion of proper (local) time along individual world lines which could order events in the
very early universe. The upshot of our discussion on these points was that classical systems
appear to be necessary throughout cosmic history (to have a reasonable time concept). It
is standard to hold that quantum gravity sets in at $10^{-43}$ s, i.e. that there is no time concept
“before” this Planck time. But our discussion indicates that if one believes that everything 
is quantum, then one has a problem with time in general (and not only in quantum gravity)! 

Let us finally briefly consider whether the possible limits to time are a misfortune for 
cosmology. We think not. Limits in science are good for at least two reasons. First, they 
should not be seen as stumbling blocks for research but rather as invitations to keep asking 
questions, e.g. as to which theories might describe what lies beyond the present tempo- 
ral limits (or how the limits might be circumvented, e.g. by introducing speculative new 
physics). Such invitations can be expected to remain open since for any postulated theory 
describing earlier times, it will probably always be possible to ask: what lies beyond that 
theory? This leads to the second reason: The fact, if it is a fact, that there will always be 
something beyond our (current?) scientific understanding may be aesthetically attractive, 
if not also comforting.26 Both of these reasons for endorsing limits are connected to that
feeling of wonder which has been an important driving force throughout the history of 
cosmology. 


20
Self-Locating Priors and Cosmological Measures 


CIAN DORR AND FRANK ARNTZENIUS 


20.1 Introduction 


It seems like bad news for a theory if it entails that almost all of those who perform a 
certain experiment get a certain result, and we actually perform that experiment and get a 
different result. But it is not immediately obvious why this should be bad news, or what 
kind of bad news it is, when the theory in question is logically consistent with the fact that 
we got the result we actually got. This is something we need to understand better. It is not 
enough just to say that other things being equal, a theory’s having this feature is a reason 
not to believe it, since other things are never equal. In all the interesting cases, the theories 
in question will have a great many other features which are, ceteris paribus, reasons to 
believe them — they may be attractively strong and simple; they may accurately predict 
the values of certain measured parameters, and so forth. We need a framework of thinking 
about the bearing of evidence on theories that can give us some guidance about how these 
factors trade off against one another. 

Indeed, we can see that it is not always bad news for a theory when we make observations 
that are atypical according to it. For consider the following theory (if you want to call it 
that): We will perform experiment E and get result A, although almost everyone else who 
performs experiment E will get result B. Since our getting any result other than A would 
refute this theory, this result looks like good news rather than bad news. So, this is another 
place where we need some guidance. 

The need for such a framework is especially pressing when we turn our attention from 
the elaborate experiments physicists are paid to perform to the ‘experiments’ we perform 
all the time whether we like it or not — for example, the experiment of standing in front of a 
mirror and seeing what you look like. There might be gazillions of non-human observers in 
the universe, most of whom see completely different things when they look into mirrors. It 
seems foolish to reject a serious theory that posits a multitude of disparately shaped aliens 
just on the grounds that you see a human-like form (two arms, two legs, ...) when you 
look in a mirror. Surely the place to look if you want to investigate a theory like that is a 
telescope, not a mirror! But it is unclear how this piece of common sense is to be reconciled 
with the idea that it is bad when a theory says that our observations are atypical. Small won- 
der that so many practising physicists are suspicious of considerations having to do with 
the typicality of our observations (‘anthropic’ considerations) and would prefer to be able
to make comparisons between theories without ever having to think about such matters. 

Unfortunately for them, it is hard to see how one could possibly avoid the need to 
take such considerations into account. Since the work of Boltzmann — if not that of 
Democritus — physics has thrown up a succession of theories which seem to have the prob- 
lematic feature that they make our actual observations excessively atypical, while being 
simple and attractive in other respects, and also logically consistent with our evidence, so 
that it is not clear what can be said against them without appealing to considerations of 
atypicality. In Boltzmann’s picture, a fixed finite stock of particles continue to exist and to 
interact eternally, with the result that every possible dynamical state of those particles (with 
a fixed total energy) eventually comes arbitrarily close to being realised, including all those 
possible dynamical states that subserve the existence of observers making observations of 
any humanly possible kind. Nothing that we have observed is inconsistent with this theory. 
Its only obvious defect is that according to it, the vast majority of observations are utterly 
unlike our actual observations. They are, rather, the kinds of observations one would expect 
to be made by ‘Boltzmann Brains’ (Albrecht and Sorbo, 2004) — short-lived, isolated 
observers who came into existence as part of a recent, localized fluctuation from equi- 
librium. If one denies that this is any kind of problem for Boltzmann’s theory, one seems 
forced into the position that observations could never bear in any way on a theory that 
entails that every possible observation is made at least once in the history of the universe. 
But this conclusion would be a disaster, since cosmologists have considered a multitude 
of serious hypotheses with exactly this feature. If empirical investigation can never favour 
some of these hypotheses over others, we are doomed to a paralysing level of scepticism. 

These examples also remind us — if it was not clear enough already — that we need to 
think seriously about what it even means to say that our observations are ‘atypical’. For 
given Boltzmann’s theory, every possible observation is made not just once, but infinitely 
many times. The cardinalities are the same: there is a countable infinity of observations just 
like ours, and a countable infinity of observations unlike ours. So in what sense is it true to 
say that ‘most’ or ‘almost all’ observations are unlike ours? One could try to make sense 
of the ‘most’ claim by taking some kind of limit using a sequence of longer and longer 
finite temporal intervals. But what could make this the right way to compare the infinite 
sets? And how are we supposed to generalise it to more recent infinite-population theories 
which are set in relativistic spacetimes whose extent may be infinite in both temporal and 
spatial directions? 
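
The worry about comparing two countably infinite sets can be made concrete with a small sketch. It is only an illustration under stated assumptions: the labels "ours" and "other", the one-in-ten pattern, and the function names are hypothetical and do not come from any physical theory. Both streams below list infinitely many observations of each kind, yet the limiting frequency obtained from finite cutoffs depends entirely on the order in which the observations are listed.

```python
from fractions import Fraction
from itertools import islice


def temporal_order():
    """Hypothetical history in which 1 listed observation in 10 is 'like ours'."""
    n = 0
    while True:
        yield "ours" if n % 10 == 0 else "other"
        n += 1


def reordered():
    """A different listing with infinitely many of each kind,
    arranged so that 9 in every 10 listed observations are 'like ours'."""
    n = 0
    while True:
        yield "other" if n % 10 == 0 else "ours"
        n += 1


def frequency(stream, cutoff):
    """Fraction of 'ours' among the first `cutoff` items of the stream."""
    hits = sum(1 for label in islice(stream, cutoff) if label == "ours")
    return Fraction(hits, cutoff)


# The cutoff frequencies stay at 1/10 for one ordering and 9/10 for the other.
for cutoff in (10, 1_000, 100_000):
    print(cutoff, frequency(temporal_order(), cutoff), frequency(reordered(), cutoff))
```

Nothing in the cardinalities favours one listing over the other, which is why some further ingredient is needed to single out a preferred way of taking the limit.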

Quite apart from problems associated with infinity, the typicality or otherwise of our 
observations clearly depends on the class of objects you are considering: a feature that is 
typical among primates need not be typical among all animals. But what is the relevant 
class of things when we are trying to figure out whether our observations are, or are not, 
problematically atypical according to a certain theory? 

In what follows we will develop a Bayesian framework for answering all these questions. 


20.2 Bayesian Background


We will assume that an ideally rational person at any time t has degrees of belief
(‘credences’) which can be represented by a probability function $C(\cdot \mid E_t)$: the result of
conditionalising a certain other probability function C (her ‘priors’) on the total evidence
$E_t$ that she has at t. In this paper, ‘probability functions’ will always be functions P from
propositions (things that can be true or false, and believed to various degrees) to real
numbers, such that


(i) P(A) = 1 whenever A is logically necessary;
(ii) P(A) ≤ P(B) whenever B is a logical consequence of A; and
(iii) P(A ∨ B) = P(A) + P(B) whenever A and B are logically incompatible.1

We take the set of propositions on which rational peoples’ credence functions are defined 
to include both eternal, qualitative propositions (like There exists or will exist or has existed 
at least one physicist) and self-locating propositions (like There is a physicist in front of 
me) which attribute a qualitative property to the particular agent at the particular time in 
question.2

Note that we are merely assuming that rational people can be so represented, not that 
they need to have priors in their heads in any psychologically realistic sense — let alone 
that they need to have had them in their heads temporally prior to any given episode of 
rational belief-formation. We are also not assuming any particular account of evidence, 
e.g. that only propositions about one’s conscious experience at t can be part of one’s evidence
at t. Everything we say will be compatible with externalistic views on which a rich
body of truths about one’s surroundings and history count as part of one’s evidence (e.g. 
Williamson, 2000). While the difference between these conceptions of evidence can make 
a big difference in some cases, we do not think it will matter to any of the theoretical 
comparisons we will be concerned with in this paper. 

One advantage of characterising a rational person’s credences as the result of conditionalising
her priors on her total evidence at the relevant time, rather than as the result
of conditionalising her previous credence function on the evidence that she just received,
is that it determines, in a prima facie plausible way, how a rational person’s credences
evolve when she forgets things, and how her credences evolve in response to changing
self-locating evidence.3 Most importantly, it gives us a setting in which we can pose questions
not just about how new evidence should modify our credences, but also about how the
evidence that we already have bears on a given theory. It is vital to be able to make sense of
this, since in many cases it is quite an achievement to extract any observational predictions
at all from a theory, and often the predictions we manage to extract will concern observations
that we have already made. Such ‘old evidence’, whose relevance we want to be able
to discuss, includes not just facts about experiments that have been done by physicists, but
familiar facts of everyday life that are well known to everyone. In the present framework,
even when a person has already updated her credences on certain evidence E, we can still
say that E confirms H for that person if her prior C is such that C(H|E) > C(H), and that
E confirms H simpliciter if C(H|E) > C(H) for any reasonable C.


1 Note that these conditions entail that P(A) is never greater than 1 (since in that case P(A ∨ ¬A) would also
have to be greater than 1 by (ii), contradicting (i)), and that P(A) is never less than 0 (since in that case P(¬A)
would have to be greater than 1 by (i) and (iii)). Note too that (iii) immediately extends to all finite disjunctions:
if $A_1, \ldots, A_n$ are pairwise logically incompatible, $P(A_1 \lor \ldots \lor A_n) = P(A_1) + \ldots + P(A_n)$. By contrast with
the standard mathematical definition of ‘probability function’, we do not require probability functions to satisfy
countable additivity, the analogue of this for countably infinite disjunctions.

2 Our purposes in this chapter will not require taking any particular view as to the nature of propositions in
general, or of self-locating propositions in particular. Perhaps the qualitative and the self-locating propositions
are just two special subclasses of the class of all propositions, the former being those that are not ‘directly about’
any particular objects at all and the latter being those that are ‘directly about’ the person and time in question
but nothing else. Or perhaps self-locating propositions should be treated as sui generis, as in the influential
approach of Lewis (1979).

3 See Arntzenius (2003) for further discussion of these issues.


We stated above that the credences of a rational person are defined on self-locating as 
well as qualitative propositions. In recent years the epistemology of self-locating belief has 
been the focus of a substantial body of literature (for a survey, see Titelbaum, 2013). That 
literature tends to focus on thought experiments involving perfect duplication of experience 
between different people, or between the same person at different times. This might suggest 
that the inclusion of self-locating propositions in the framework is a technical innovation 
driven primarily by such thought experiments. But the idea that there is a crucial difference 
between learning that you now have a certain property and merely learning that someone, 
sometime has that property is really a completely intuitive one, which can be illustrated by 
any number of everyday cases. For instance, suppose that you and 20 friends have booked 
all the rooms in a 21-room hotel. You remember either reading in the hotel brochure that 
all but one room in the hotel is red, or reading that only one room is red. You just do not 
remember which, and you are initially about 50-50 as to the type of hotel that you have 
booked. Upon arrival you and your friends randomly pick rooms to go to. You then find 
that your room is red. If you took your evidence to be the qualitative proposition Someone 
is in a red room you would have no reason to modify your credences regarding the type of
hotel that you are in, since that proposition had to be true either way. But if your evidence 
is the self-locating proposition I am in a red room, it strongly supports the hypothesis that
all but one room in your hotel is red. And it seems obvious that this is the correct way 
to reason. This is presumably what you would conclude if you believed that you were the 
only person in the hotel, and it surely makes no relevant difference whether you believe 
that you have 20 friends with you or not. 
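
The arithmetic behind the hotel case can be sketched as follows. This is a minimal illustration, assuming (as the story suggests) a 50-50 prior and that random room selection makes the chance of finding oneself in a red room equal to the proportion of red rooms; the function name `posterior_mostly_red` is hypothetical.

```python
from fractions import Fraction

ROOMS = 21


def posterior_mostly_red(prior, like_mostly_red, like_one_red):
    """Bayes' theorem for two mutually exclusive, exhaustive hypotheses."""
    num = prior * like_mostly_red
    return num / (num + (1 - prior) * like_one_red)


prior = Fraction(1, 2)  # "about 50-50" before arriving

# Self-locating evidence "I am in a red room": likelihood is the proportion
# of red rooms, assuming rooms are picked at random.
self_locating = posterior_mostly_red(
    prior, Fraction(ROOMS - 1, ROOMS), Fraction(1, ROOMS)
)

# Qualitative evidence "Someone is in a red room": certain on both
# hypotheses, because every room is occupied.
qualitative = posterior_mostly_red(prior, Fraction(1), Fraction(1))

print(self_locating)  # 20/21
print(qualitative)    # 1/2
```

On these assumptions the self-locating evidence moves the credence in the mostly-red hotel from 1/2 to 20/21, while the qualitative evidence leaves it at 1/2.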

Of course, we have claimed that one should form credences by conditionalising one’s 
priors on one’s total evidence at any given time, and it is unrealistic to think that I am in a
red room is your total evidence. However, it is not unrealistic to think that this is the only 
part of your evidence that we need to take into account in order to assess how your evidence 
bears on the comparison between the two live hypotheses about what kind of hotel you are 
in. Formally, when $E^-$ is a consequence of your total evidence E, $E^-$ will exhaust the
bearing of E on two hypotheses $H_1$ and $H_2$ whenever $C(E \mid E^- H_1) = C(E \mid E^- H_2)$. For in
that case,


\[
\frac{C(H_1 \mid E)}{C(H_2 \mid E)}
= \frac{C(E H_1)}{C(E H_2)}
= \frac{C(E \mid E^- H_1)\, C(E^- H_1)}{C(E \mid E^- H_2)\, C(E^- H_2)}
= \frac{C(E^- H_1)}{C(E^- H_2)}
= \frac{C(H_1 \mid E^-)}{C(H_2 \mid E^-)}
\]

so that $E^-$ can serve as a proxy for your total evidence E when assessing its bearing on
these two hypotheses. In the present case, it is plausible that according to your priors, your 
total evidence in all its detail is just as likely on the assumption that you are in the only red 
room in the hotel as it is on the assumption that you are in one of 20 red rooms in the hotel, 
so that My room is red can serve as a proxy for your total evidence. 

In this example, the hypothesis All but one room in my hotel is red is not a quali- 
tative proposition, and it would obviously be crazy to think that your total qualitative 
evidence can serve as a proxy for your total evidence when the relevant hypotheses are 
self-locating propositions. But we could instead have focused on the qualitative hypothe- 
sis There exists a 21-room hotel with 20 red rooms. Given what you remember about the 
booking, you should clearly become more confident in this when you find yourself in a red 
room. 

It is clear that your total evidence is not a qualitative proposition. We need not assume 
that it is a self-locating proposition either (i.e. one that attributes a qualitative property to 
the agent): perhaps we should instead take it to be a ‘de re’ proposition like I am in this
particular red room, which attributes to the agent a non-qualitative property involving a 
particular object. We will however be assuming that for the purposes of reasoning about 
qualitative and self-locating propositions, one’s total self-locating evidence can serve as a 
proxy (in the sense explained above) for one’s total evidence. Having absorbed the lesson 
that conditionalising on an existential generalisation often has very different effects from 
conditionalising on an instance of that generalisation, this assumption might seem implausible
given the de re view of evidence: why would I am in a red room be any better as a
proxy for I am in this particular red room than Someone is in a red room? But when we
bear in mind that one’s self-locating evidence might include propositions like I am in a
room that looks this highly distinctive way, or I am in a room that I remember having been 
in 20 years ago, it becomes hard to imagine a plausible view on which the mere identity of 
the particular objects in one’s environment would have any further capacity to discriminate 
among qualitative or self-locating hypotheses. 

Some theorists have taken seriously the ‘relevance-limiting thesis’ according to which 
only qualitative evidence needs to be taken into account when we are reasoning about qual- 
itative hypotheses.* Their idea for dealing with apparent counterexamples like our hotel 
case is to say that our total qualitative evidence is quite rich — not just Someone is in a red 
room, but Someone is in a red room experiencing such-and-such detailed pattern of light 
and shade, hearing such-and-such sounds, having such-and-such memories ... When we 
are reasoning about hypotheses according to which the population of the universe is small 


4 The label is due to Titelbaum (2013); defenders include Halpern and Tuttle (1993), Halpern (2004), Meacham 
(2008), and Neal (2006) (approvingly cited in Carroll, 2010, p. 401). In a somewhat similar vein, Hartle and 
Srednicki (2007, p. 1) claim that ‘Cosmological models that predict that at least one instance of our data exists 
(with probability one) somewhere in spacetime are indistinguishable no matter how many other exact copies of 
these data exist’, although their later work (Srednicki and Hartle, 2013, 2010) suggests that their view may be 
a ‘permissivist’ one on which ideal reasonableness permits, but does not require, the disposition to reason in 
such a way that one’s credences concerning the accuracy of such models never evolve under the impact of new 
evidence. 
enough that it is vanishingly unlikely that there would be more than one person satisfying 
such a rich description, this lets us recover the reasonable-looking patterns of reasoning 
described above, at the cost of having to count all sorts of intuitively irrelevant aspects of 
one’s evidence as crucially relevant. For example, it is plausible that a reasonable prior 
credence function will count it as approximately 20 times more likely that someone will 
satisfy the above rich description conditional on there being 20 people in red rooms than 
conditional on there being only one person in a red room. But as soon as we start thinking 
about scenarios in which we can no longer reasonably neglect the possibility that there is 
more than one witness to the existential quantification that is our total qualitative evidence, 
the approach that looks only at qualitative evidence will start to generate distinctive and 
implausible results. For example, consider the following case: 


Measuring a Parameter: Two qualitative theories T1 and T2 both entail that the population 
of the universe is vast but finite. They differ with regard to the value of a certain 
cosmological parameter α. T1 says that the true value of α is 34.31, and T2 says that it is 34.59. 
Because of this, T1 and T2 also differ as regards the distribution of results among the many 
repetitions in the history of the universe of a certain experiment E which fairly reliably 
measures the value of α. Conditional on T1, the expected proportion of those who get the 
result 34.31 among those who do E is approximately 1/20, while conditional on T2, the 
expected proportion is approximately 1/1000. 


Intuitively, doing E and getting 34.31 very strongly favours T1 over T2. But if the pop- 
ulations are sufficiently large, our qualitative evidence will deserve high prior credence 
conditional on both T1 and T2, even if we are careful to include all manner of apparently 
irrelevant background details. Thus the view that only qualitative evidence matters in rea- 
soning about qualitative hypotheses leads to a disastrous scepticism about our ability to 
bring empirical evidence to bear in distinguishing different large-population hypotheses.⁵ 
As if this were not bad enough, the relevance-limiting thesis faces the further problem 
that it requires a counterintuitive boost in the posterior probabilities of theories according 
to which the population is large relative to their prior probabilities, simply because the 
more people there are, the less unlikely it will be (according to a reasonable prior) that 
any given very detailed qualitative property has at least one instance. When combined 
with the approach’s inability to allow for evidence-based discrimination between these 
large-universe hypotheses, this threatens to lead to a truly paralysing sceptical collapse. 
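
The two difficulties just described can be made concrete with a small numerical sketch. It assumes, purely for illustration, that each observer independently satisfies a given rich description with the same small probability; the function names and the numbers are hypothetical and do not come from the authors discussed above.

```python
import math


def prob_some_observer_matches(p_match: float, population: float) -> float:
    """Chance that at least one of `population` observers fits a description,
    assuming (hypothetically) independent observers who each fit it with
    probability p_match, where 0 <= p_match < 1."""
    # 1 - (1 - p)^N, evaluated with log1p/expm1 so that tiny p is not rounded away.
    return -math.expm1(population * math.log1p(-p_match))


def qualitative_likelihood_ratio(p1: float, p2: float, population: float) -> float:
    """How strongly the purely existential evidence favours theory 1 over theory 2."""
    return (prob_some_observer_matches(p1, population)
            / prob_some_observer_matches(p2, population))


# Small populations: 20 people in red rooms versus one person in a red room.
RICH_DESCRIPTION = 1e-12  # hypothetical chance that one person fits the rich description
small_case = (prob_some_observer_matches(RICH_DESCRIPTION, 20)
              / prob_some_observer_matches(RICH_DESCRIPTION, 1))

# Measuring a Parameter with a vast population: the result 34.31 is 50 times more
# common under T1 than under T2, yet the existential evidence barely notices.
BACKGROUND_DETAIL = 1e-30  # hypothetical chance of fitting all the background detail
VAST_POPULATION = 1e40     # hypothetical
large_case = qualitative_likelihood_ratio(
    BACKGROUND_DETAIL / 20, BACKGROUND_DETAIL / 1000, VAST_POPULATION)

print(small_case, large_case)
```

On these assumptions the small-population ratio comes out very close to 20, while the large-population ratio collapses to 1. The same function also displays the boost for large populations: for a fixed description, the returned probability only grows as the population grows.
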
One might still try to make a last gasp effort to make do only with purely qualitative 
evidence by adopting an ultra-fine-grained conception of evidence, on which your total 
qualitative evidence is the existential generalisation of a property so specific that you can 


5 Neal (2006) attempts to address this problem by (i) adopting a very fine-grained conception of evidence, on 
which the population of the universe would have to be quite large (he suggests something in the order of 1910!) 
for there to be a substantial chance that the qualitative property attributed by our total evidence has multiple 
instances; and (ii) proposing that we should simply ignore the possibility that the population is this large. This 
‘ignoring’ strikes us as patently unreasonable, in spite of the dubious verificationist and ethical considerations 
which Neal offers in its favour. (On the ethical ramifications of infinite populations, see Arntzenius, 2014.) 
legitimately neglect the possibility that more than one person has it, no matter how many 
people there are. For example, the property might specify the exact values of certain con- 
tinuous parameters (perhaps having to do with one’s mental state), in which case it is 
plausible that reasonable priors will assign its existential generalisation probability zero, 
even conditional on the number of people being (countably) infinite. Implementing this 
strategy would require an elaboration of the framework in which conditional priors are taken 
as primitive rather than defined by C(A|B) = C(AB)/C(B), so that one can meaningfully 
conditionalise even on propositions whose prior credence is zero (see Hájek, 2003). The 
question how one’s unconditional priors should be extended to allow for conditionalisa- 
tion on probability-zero propositions raises tricky issues (see Dorr, 2010; Myrvold, 2015). 
Indeed, the project of formulating rules for extending unconditional priors to conditional 
priors involves puzzles that are in many ways analogous to those that arise for the project 
of formulating rules for extending qualitative priors to self-locating priors. Given that the 
ultra-fine-grained conception of evidence has severe foundational problems — intuitively, 
it vastly overstates the extent to which beings like us could ever hope to get their beliefs 
to correlate with the exact value of any continuous parameter — we will set it aside, while 
noting that the ideas about self-locating priors which we will discuss in this paper will have 
analogues within the ultra-fine-grained framework. 

One final note: the Bayesian framework has often been combined with the idea that 
there are no rational constraints on priors beyond the probability axioms. This would make 
the present enquiry trivial. We will be taking it for granted that there are better and worse 
priors to have, and that factors such as simplicity can legitimately be appealed to in saying 
what makes the better ones better. This means that we accept that in certain senses of 
‘simple’, you should be pretty confident that the world is ‘simple’ in the absence of relevant 
evidence. Some will find this suspiciously rationalistic. They will be tempted to think that 
reasonable priors should instead be ‘unbiased’ or ‘uniform’. But making sense of a relevant 
notion of lack of bias or uniformity is extremely difficult (especially in the infinite case). 
And in the limited range of cases where one can make sense of it (e.g. when there are only 
finitely many possibilities or there is a unique measure having certain natural symmetries), 
such ‘uniform’ or ‘unbiased’ priors turn out to have the feature that you typically do not 
learn anything interesting about the world given any finite amount of evidence. In any case, 
legislating some notion of uniformity or lack of bias seems equally aprioristic to us. 


20.3 A Principle About Finite Worlds 


In our present state of understanding, it would be foolish to try to codify the features that 
make some priors more reasonable than others — simplicity and so forth — in the form of 
some precise collection of axioms. In general, we just have to muddle along as best we can 
by trusting our judgements about particular cases. However, there are a few special domains 
where we have the resources to formulate principles about what reasonable prior credence 
functions are like which go beyond the probability axioms, are not obviously false, and 
are precise enough to be worth arguing about. One of these domains is the epistemology of 
self-locating belief conditional on there being only finitely many observers. In this domain, 
we can formulate a general principle which, if true, will allow one completely to specify a 
unique reasonable assignment of prior credences to self-locating propositions (conditional 
on the population being finite), given as input a reasonable assignment of prior credences 
to qualitative propositions. This principle can be seen as a very limited principle of indiffer- 
ence: the intuition is that your self-locating priors, conditional on a certain qualitative state 
of affairs, should be indifferent among all the observers who exist in that state of affairs.⁶ 
Or better — since self-locating propositions may address the question when it is as well 
as who you are — your conditional self-locating priors should be indifferent among all the 
portions of the lives of observers in the relevant state of affairs whose duration is the same. 

To state this principle more rigorously, we will need to introduce some notation. Where 
P is a probability function and R is a real-valued random variable — which we can identify 
with a function that maps each real number x to a proposition ‘R = x’ in such a way that the 
resulting propositions are pairwise inconsistent and jointly exhaustive — we will use ‘P(R)’ 
to denote the expectation value of R in P. Similarly we write ‘P(R|H)’ for the expectation 
value of R in P(·|H). When F and G are any properties, we write (F : G) for the random 
variable whose value is the ratio between the total, for all the things that are ever F, of the 
duration for which they are F and the total, for all the things that are ever G, of the duration 
for which they are G. For example, (physicist : philosopher) = 10 is the proposition that 
the total duration of all philosophers’ careers is positive and ten times smaller than the 
total duration across history of all physicists’ careers.⁷ Then our principle can be stated as 
follows: 


PROPORTION Where H is any qualitative hypothesis which entails that the total duration 
of the lives of all observers is positive and finite, and F is any qualitative property, and C 
is any reasonable prior credence function such that C(H) is positive: 


C(I am F|H) = C((F observer : observer)|H) 


In words: your prior probability that you are F given H should equal your prior expectation, 
given H, for the proportion of all observer time taken up by F. Note that the expectation 
value of this random variable depends only on how C treats qualitative propositions. Thus 
PROPORTION fully determines one’s self-locating priors (conditional on the total duration 
of the lives of all observers being positive and finite) as a function of one’s qualitative 
priors. 
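
As a rough illustration of how PROPORTION turns qualitative priors into a self-locating prior, the following sketch computes the expected share of observer time over a finite list of qualitative possibilities. The `World` type, the function name and the toy numbers are our own illustrative assumptions, not part of the principle's statement.

```python
from fractions import Fraction
from typing import NamedTuple, Sequence


class World(NamedTuple):
    """One qualitative possibility compatible with the hypothesis H (illustrative type)."""
    prior: Fraction          # qualitative prior credence in this world
    f_time: Fraction         # total observer time spent being F
    observer_time: Fraction  # total observer time; must be positive and finite


def self_locating_prior(worlds: Sequence[World]) -> Fraction:
    """C(I am F | H): the expected value, given H, of (F observer : observer)."""
    total = sum((w.prior for w in worlds), Fraction(0))
    if total <= 0:
        raise ValueError("H needs positive prior credence")
    # Weight each world's F-share of observer time by that world's credence given H.
    expected = sum((w.prior * w.f_time / w.observer_time for w in worlds), Fraction(0))
    return expected / total


# Toy hypothesis with two equally likely worlds: 100 equal-length observer lives of
# which one is F, or a single observer life all of which is F.
toy = [
    World(Fraction(1, 2), Fraction(1), Fraction(100)),
    World(Fraction(1, 2), Fraction(1), Fraction(1)),
]
print(self_locating_prior(toy))  # 101/200
```

Only qualitative inputs appear on the right-hand side: the credence in each world and the two durations. That is the sense in which the self-locating prior is fixed as a function of the qualitative priors.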

To get a sense for the appeal of PROPORTION, let us return to the case of Measuring 
a Parameter. We can take it that T1 and T2 do not differ as regards the proportion of 


6 This basic thought is what Bostrom (2002) calls ‘The Self-Sampling Assumption’: ‘One should reason as if one 
were a random sample from the set of all observers in one’s reference class’. PROPORTION below is intended as 
a precisification of this vague formulation. Our principle does not talk about ‘reference classes’: where Bostrom 
would ask ‘What is the right way of specifying the reference class?’, we will simply ask ‘What is the right way 
of defining ‘observer’?’ 

7 We leave it open what the relevant notion of ‘duration’ is: it might be taken to be the physical notion of 
proper time; subjective (psychological) time; some measure of complexity of evolution in the relevant system’s 
physical state; or something else again. 
observers who ever do experiment E, or as regards how far along they are in their lives 
(on average) when they do it. In that case, the expected ratio between the total amounts of 
observer time occupied by the properties having done E and got 34.31 and having done E 
will be equal to the expected proportion of trials of E that yield the result 34.31 according 
to that theory. So according to PROPORTION, C(I did E and got 34.31|T1) will be 
(approximately) 50 times greater than C(I did E and got 34.31|T2) for any reasonable C, 
since (1/20)/(1/1000) = 50. Thus, assuming that none of the other details of your total 
evidence beyond the fact that you did E and got 34.31 are relevant to the comparison between 
T1 and T2, we can draw a conclusion about how the experiment should affect your credences: 
the ratio of your credence in T1 to your credence in T2 should be 50 times greater than it 
would have been with no relevant evidence. 
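
The update in Measuring a Parameter is an ordinary odds calculation, which can be sketched as follows. The function names and the prior odds of 1 : 3 are hypothetical; the two proportions are the ones stated in the case (1/20 under T1 and 1/1000 under T2).

```python
from fractions import Fraction


def bayes_factor(proportion_t1: Fraction, proportion_t2: Fraction) -> Fraction:
    """Ratio of the two self-locating likelihoods delivered by PROPORTION.

    The shared factor (the share of observer time spent having done E at all)
    cancels, leaving the ratio of the expected proportions of trials.
    """
    return proportion_t1 / proportion_t2


def updated_odds(prior_odds: Fraction, factor: Fraction) -> Fraction:
    """Posterior odds of T1 against T2 after doing E and getting 34.31."""
    return prior_odds * factor


factor = bayes_factor(Fraction(1, 20), Fraction(1, 1000))
print(factor)                                # 50
print(updated_odds(Fraction(1, 3), factor))  # 50/3
```

Whatever the prior odds were, the evidence multiplies them by the same factor, which is why the conclusion can be stated without settling the priors for T1 and T2.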

Ideas in the vicinity of PROPORTION are often summed up with slogans like ‘We should 
expect ourselves to be typical’. But such slogans need to be treated with care. If being a 
typical observer means having only those properties that most observers have, we should 
obviously expect not to be typical; indeed we should be confident that everyone has some 
properties that most observers lack. For the same reason, if being a typical observer means 
something like being an average observer, we should also expect not to be typical. To 
extract from PROPORTION the claim that we should be confident that we are typical 
observers, we need to devise an interpretation for ‘typical observer’ on which it is trivially 
true that if there are finitely many observers, most of them are typical. If we were con- 
cerned only with typicality in one particular quantitative respect, we could simply define 
‘typical’ to mean ‘having a value for the relevant quantity that is within so many standard 
deviations of the mean value among all observers’. But making sense of an ‘all things con- 
sidered’ notion of typicality is a much harder task, and not one that we have any need to 
take on. 

Let us turn next to a more controversial application of PROPORTION. 


Brains or No Brains? Two theories Brains and No Brains are generally similar except 
that Brains predicts that, after the heat death of the universe, an enormous (but still finite) 
number of observers come into existence because of random vacuum fluctuations, while No 
Brains includes some mechanism that prevents this from happening. Most of the randomly 
produced observers that exist according to Brains are ‘Boltzmann Brains’ — i.e. things that 
just barely qualify as ‘observers’ in the relevant sense, however we end up cashing it out. 
Although almost all the Boltzmann Brains are short-lived, Brains predicts that there are so 
many of them that the total duration of their lives is much larger than the total duration of 
all the ordinary observers’ lives. And while a few of the Boltzmann Brains will, by chance, 
have misleading experiences as of living in a world like ours, fake memories, etc., almost 
all of them will spend their entire lives enduring the rather unpleasant sorts of experiences 
one would expect given the inhospitable conditions in which they have come into existence. 


Consider our current evidence — evidence, perhaps, as of sitting in a comfortable room, 
drinking a cup of tea while typing on a computer keyboard. Brains and No Brains differ 
radically with regard to the proportion of observer time occupied by this qualitative prop- 
erty. So according to PROPORTION, a reasonable prior credence function will assign this 
evidence much higher probability conditional on No Brains than conditional on Brains, 
so that the evidence counts heavily in favour of No Brains. Of course, this does not yet 
mean that we should be much more confident in No Brains than in Brains. This confi- 
dence depends on the priors for the two theories as well as on the evidence, and No Brains 
might have features in virtue of which it deserves a much lower prior credence — e.g. if the 
mechanism that prevents Boltzmann Brains from forming is an ad hoc postulate without 
any further motivation, it might detract greatly from No Brains’s simplicity. However, the 
larger the ratio of Boltzmann Brains to normal observers is according to Brains, the harder 
it will be to justify an asymmetry in the priors large enough to compensate for the force of 
the evidence.⁸ 
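
The trade-off between the prior asymmetry and the force of the evidence can be sketched numerically. Everything in this block is an illustrative assumption: ordinary observer time is normalised to 1, `brain_ratio` is the amount of Boltzmann Brain time relative to it, and `deluded_share` is the fraction of Boltzmann Brain time spent with evidence like ours.

```python
def evidence_share(ordinary_share: float, brain_ratio: float = 0.0,
                   deluded_share: float = 0.0) -> float:
    """Proportion of all observer time spent with evidence like ours.

    ordinary_share: fraction of ordinary observer time with this evidence.
    brain_ratio:    Boltzmann Brain time per unit of ordinary observer time.
    deluded_share:  fraction of Boltzmann Brain time with this evidence.
    """
    return (ordinary_share + brain_ratio * deluded_share) / (1.0 + brain_ratio)


def credence_in_no_brains(prior_no_brains: float, ordinary_share: float,
                          brain_ratio: float, deluded_share: float) -> float:
    """Posterior credence in No Brains, supposing exactly one of the two theories is true."""
    like_no_brains = evidence_share(ordinary_share)
    like_brains = evidence_share(ordinary_share, brain_ratio, deluded_share)
    joint_no_brains = prior_no_brains * like_no_brains
    joint_brains = (1.0 - prior_no_brains) * like_brains
    return joint_no_brains / (joint_no_brains + joint_brains)


# Even a prior of one in a thousand for No Brains is overwhelmed once Boltzmann
# Brain time outweighs ordinary observer time by a factor of a billion.
print(credence_in_no_brains(0.001, 1e-3, 1e9, 0.0))
```

When `deluded_share` is negligible, the evidence favours No Brains by a factor of roughly `1 + brain_ratio`, so the prior odds against No Brains would have to be of that order to compensate.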

Brains or No Brains? shows that PROPORTION has some distinctive and controversial 
implications when combined with other plausible claims about reasonable prior credences. 
In that example, the required additional claim was to the effect that if theories are roughly 
similar in respect of simplicity, etc., they should not be assigned very different prior 
credences. In other examples, the additional claim that combines with PROPORTION to 
generate controversial implications is one that is often taken for granted in applications 
of Bayesian methods: namely, that conditional on the hypothesis that a certain function 
is the one that maps each proposition to its objective chance of being true — its physical 
probability — our prior credences should agree with that function. 


(PP) C(A|P is the objective chance function) = P(A) whenever defined. 


(This is one version of the ‘Principal Principle’ from Lewis, 1980.) The combination of 
(PP) with PROPORTION makes for distinctive consequences when we are dealing with 
theories according to which there are significant objective chances for substantially different 
total numbers of observers. For example, consider the following case from Bostrom (2002): 


The Incubator Stage (a): The world consists of a dungeon with one hundred cells. The 
outside of each cell has a unique number painted on it (which cannot be seen from the 
inside); the numbers being the integers from 1 to 100. The world also contains a mechanism 
which we can term the incubator. The incubator first creates one observer in cell #1. It 
then ... flips a fair coin. If the coin falls tails, the incubator does nothing more. If the coin 
falls heads, the incubator creates one observer in each of the cells #2–100. Apart from this, 
the world is empty. It is now a time well after the coin has been tossed and any resulting 
observers have been created. Everyone knows all the above. 


Stage (b): A little later, you have just stepped out of your cell and discovered that it is #1. 
(Bostrom, 2001, 363) 


Let S be the qualitative description of this set-up, supplemented with the stipulation that all 
observers created by the incubator live equally long lives; let H and T, respectively, be the 


8 Of course, we do not have to rest content with the evidence we can get by sitting in our armchairs: we can 
also go out and do some experiments. However, unless these experiments have an incredibly strong ability to 
discriminate the theories, they will not change the epistemic situation very much. 
conjunctions of S with the propositions that the coin lands Heads and that the coin lands 
Tails. Since S specifies that the chances of H and T are equal, (PP) says that C(H|S) = 
C(T|S) = 1/2 for any reasonable C. Since H entails that there are exactly 100 equally 
long-lived observers of whom one is in a cell numbered #1, while T entails that all observer 
time is spent in a cell numbered #1, we have C((observer in cell #1 : observer)|H) = 1/100 
and C((observer in cell #1 : observer)|T) = 1. So by PROPORTION, C(I am in cell #1|H) = 
1/100 while C(I am in cell #1|T) = 1. We can thus apply Bayes’s theorem to the probability 
function C(·|S) to get 


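\[
\begin{aligned}
C(H \mid \text{I am in cell \#1} \wedge S)
  &= \frac{C(\text{I am in cell \#1} \mid H)\, C(H \mid S)}{C(\text{I am in cell \#1} \mid S)} \\
  &= \frac{C(\text{I am in cell \#1} \mid H)\, C(H \mid S)}
          {C(\text{I am in cell \#1} \mid H)\, C(H \mid S) + C(\text{I am in cell \#1} \mid T)\, C(T \mid S)} \\
  &= \frac{0.01 \times 0.5}{0.01 \times 0.5 + 1 \times 0.5} = \frac{1}{101}
\end{aligned}
\]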


So if we assume that at stage (a) you have no relevant evidence beyond S, and that you 
gain no relevant evidence at stage (b) beyond the proposition that you are in cell 1, your 
credence in Heads will decrease from 1/2 at stage (a) to very low at stage (b). 

In thinking about cases like The Incubator, some have been attracted to an alternative 
view according to which your credence in Heads should be 1/2 at stage (b). On the less 
plausible version of this view, your credence should also be 1/2 at stage (a). But this is 
hard to take seriously, since the discovery that you are in cell #1 looks like strong evidence 
in favour of Tails (which entails it).⁹ On the more plausible version of the view, your 
credence in Heads should be high at stage (a), so that it can be 1/2 even after the impact of 
this strong evidence. 

Given our Bayesian framework, there are two ways to generate this high credence in 
Heads at stage (a): we could either revise PROPORTION in such a way that your evidence 
at stage (a) will count as heavily favouring Heads, or we could revise (PP) in such a way 
as to require reasonable priors to favour Heads over Tails. One natural thought that would 
motivate the relevant sort of revision to PROPORTION is the idea that the more observers 
there are, the less surprising it is that you are one of them, so that the self-locating proposi- 
tion that you exist (or that you are an observer) should count as strong evidence for Heads 
(Bartha and Hitchcock, 1999). This is inconsistent with PROPORTION, which entails that 
your prior credence that you exist and are an observer (conditional on the total duration of 
observer time being positive and finite) should be 1. If we wanted existence or observer- 
hood to have evidential force, we could easily modify PROPORTION so as to concern not 
your prior unconditional credences, but your prior credences conditional on your existence 


9 If we add the stipulation that all the observers have exactly the same evidence at stage (a), the claim that 
your credence should not change between stage (a) and stage (b) follows from the relevance-limitation thesis 
discussed in Section 20.2. There are also some, such as Bostrom (2002), who are sympathetic to this claim but 
not to the relevance-limitation thesis. 
or observerhood. The problem with this approach is that it is not really general enough. 
Deriving the judgement that your credence in Heads at stage (b) should be 1/2 in all vari- 
ants of The Incubator that differ just with regard to the two population numbers associated 
with Heads and Tails would require a prior credence function in which the probability of 
I am an observer conditional on There are n observers increases linearly in proportion to 
n. And of course this is impossible, since conditional probabilities cannot exceed 1. The 
better option, then, is to keep PROPORTION while revising (PP), by building into the priors 
a proportional bias towards hypotheses according to which there are many observers, even 
when chances are equal.¹⁰ 
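
To see why a population-proportional bias yields 1/2 at stage (b) whatever the two population numbers are, here is a minimal sketch. It assumes that cell #1 is occupied on both outcomes and that all observers live equally long; the function and its boolean switch are our own illustrative devices, not the detailed modification of (PP) referred to in the footnote.

```python
from fractions import Fraction


def credence_in_heads_at_stage_b(n_heads: int, n_tails: int,
                                 population_bias: bool) -> Fraction:
    """Credence in Heads on learning that you are in cell #1.

    n_heads, n_tails: numbers of observers created on Heads and on Tails.
    population_bias:  if True, weight each outcome's prior by its population
                      (the revised (PP)); if False, use the chances as they stand.
    """
    chance = Fraction(1, 2)
    weight_heads = chance * (n_heads if population_bias else 1)
    weight_tails = chance * (n_tails if population_bias else 1)
    # PROPORTION: with n equally long-lived observers, the chance of being the
    # one in cell #1 is 1/n.
    joint_heads = weight_heads * Fraction(1, n_heads)
    joint_tails = weight_tails * Fraction(1, n_tails)
    return joint_heads / (joint_heads + joint_tails)


print(credence_in_heads_at_stage_b(100, 1, population_bias=False))  # 1/101
print(credence_in_heads_at_stage_b(100, 1, population_bias=True))   # 1/2
```

With the bias switched on, the population factor in the prior cancels the 1/n supplied by PROPORTION, so the result is 1/2 for any pair of population numbers.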

Those who favour a high credence in Heads at stage (a) in The Incubator will presumably 
take an analogous view about other comparisons between hypotheses that disagree about 
the total number of observers. For example, in Brains or No Brains?, they will hold that 
your prior credence in Brains conditional on Either Brains or No Brains is true and I am an 
observer should be very high, so that the posterior credences in Brains and No Brains given 
normal evidence (e.g. as of sitting drinking tea and typing) end up close. But considered as 
a general model for good reasoning about the population of the universe, this seems quite 
crazy. Consider our current state of ignorance as regards how hard it is for intelligent life to 
evolve in an arbitrary solar system. When combined with a cosmological theory according 
to which spacetime as a whole is finite (but very large), different answers to this question 
will generate radically different expected numbers of total observers. The ‘bias towards 
high populations’ idea will thus lead us to the absurd result that we should right now be 
confident, conditional on the universe being finite, that it is very easy for intelligent life to 
evolve — probably easy enough that even when we conditionalise on the (by the lights of 
this approach surprising) fact that we have not yet encountered any alien life, we should 
still be confident that the average galaxy contains many inhabited solar systems. While 
this ‘abundant life’ hypothesis is not itself unreasonable, it seems clear that there are also 
perfectly reasonable hypotheses on which life is far rarer than this, and that in our current 
state of ignorance, it would be quite unreasonable to assign a very low credence to these 
hypotheses.¹¹ 

Some have argued that PROPORTION itself should be rejected on the grounds that it 
makes it easier than it should be to do astrobiology from the armchair. For example, Sean 
Carroll argues as follows against the claim that ‘we should make predictions by asking 
what most observers would see’: 


Imagine we have two theories of the universe that are identical in every way, except that one predicts 
that an Earth-like planet orbiting the star Tau Ceti is home to a race of 10 trillion intelligent lizard 
beings, while the other theory predicts there are no intelligent beings of any kind in the Tau Ceti 
system. Most of us would say that we do not currently have enough information to decide between 


10 For further discussion of this strategy, including the details of the required modification of (PP), see Arntzenius 
and Dorr (MS). 

11 Our argument here echoes Bostrom’s ‘Presumptuous Philosopher’ argument against what he calls the 
‘Self-Indication Assumption’ (Bostrom, 2002). 
these two theories. But if we are truly typical observers in the universe, the first theory strongly 
predicts that we are more likely to be lizards on the planet orbiting Tau Ceti, not humans here on 
Earth, just because there are so many more lizards than humans. But that prediction is not right, so 
we have apparently ruled out the existence of many observers without collecting any data at all about 
what is actually going on in the Tau Ceti system. (Carroll, 2010, p. 225) 


Carroll does not tell us what the two theories in his example say about life other than on 
Tau Ceti and Earth. This matters when we apply PROPORTION: if both theories entail there 
being quadrillions of observers elsewhere, and say the same things about how many of 
those observers are likely to have the qualitative property ascribed by our total self-locating 
evidence, the lizards will make no appreciable difference. But perhaps Carroll was taking it 
for granted that both theories entail that every observer is either on Earth or on Tau Ceti. In 
that case, PROPORTION does entail that our armchair evidence counts strongly against the 
lizard theory. If our priors are not biased towards high populations, and the theories really 
are on a par in the other relevant respects, we should be confident that the lizard theory 
is false, just as in The Incubator we should be confident that we are alone in the universe 
when we know that we are in cell #1.¹² 

If, like Carroll, you think this result is wrong, it is worth trying to get clear on what it is 
about the lizards that is driving your judgment. Consider a range of theories: 


T0 In any given solar system there is a certain tiny chance ε for life to evolve at all, but 
if observers do come into existence they are likely to be human-like creatures whose 
DNA uses two base pairs, having five toes on each foot. 

T1 In any given solar system, the chance of human-like life evolving is ε, but the chance 
of lizard-like life evolving is 10,000ε. 

T2 In any given solar system, the chance of human-like creatures with five toes on each 
foot evolving is ε, while the chance of human-like creatures with six toes on each foot 
evolving is 10,000ε. 

T3 In any given solar system, the chance of human-like creatures whose DNA uses only 
two base pairs is ε, while the chance of human-like creatures whose DNA uses three 
or more base pairs is 10,000ε. 


Suppose that all the theories are on a par as regards simplicity, and agree that the number 
of solar systems is finite but far greater than 1/ε. 

According to PROPORTION, our actual evidence (as of being human-like, with five toes 
and two base pairs) strongly supports T0 over all of T1–T3. In the case of T3, this result 
is quite intuitive. Suppose we had known about T0 and T3 and their predictions before 
investigating our DNA: then, surely, the discovery that we have two base pairs would have 
strongly favoured T0. The situation seems similar in every relevant respect to Measuring a 


12 Hartle and Srednicki (2007) make a similar argument about aliens living in the atmosphere of Jupiter. They 
argue that it would be unreasonable to reject a theory according to which there are many such aliens ‘solely 
because humans would not then be typical of intelligent beings in our solar system’. Of course we agree with 
this, but we note that the theories in their example are not described in enough detail — in particular, with 
regard to what they say about life outside the solar system — to determine what PROPORTION says about them. 
Parameter. Perhaps there is some temptation to think that the situation with T2 is different, 
given that we have always known how many toes we have. But how could that difference 
matter? The order in which we get evidence does not normally have much significance 
as regards what we should believe once we have the evidence; the mere fact that the toe- 
counting experiment was performed long ago is not in itself any kind of reason to discount 
its significance. This takes us back to T1 and the lizard beings. Carroll does not tell us 
very much about what is driving his judgement about them. Is it their scaly skin? Their 
cold blood? Their bizarre social structures? Their alien sensory experiences? We have the 
feeling that as the scenario is fleshed out in more detail, the sense that there could be any 
important difference between T1 and T2 will start to fade away. 

The ways of fleshing out T1 that make it most plausible that we should think about it 
differently from T2 and T3 involve the lizard beings having a mental life that is in some 
deep respect very different from ours. But in these versions of the case it is no longer 
obvious what PROPORTION says, because it is no longer clear whether the lizards count as 
‘observers’. So far we have been treating PROPORTION as a single univocal principle; but 
we should now admit that as we are conceiving it, it is really a schema that can be filled in 
in many different ways depending on how one interprets ‘observer’. Some of the instances 
of the schema have crazily counterintuitive consequences. For example, if we plug in 
‘five-toed biped’ for ‘observer’, we will get the absurd result that no amount of evidence should 
shake your confidence that you are a five-toed biped, conditional on there being a positive 
finite number of five-toed bipeds. If we understand ‘observer’ as ‘living being or rock’, 
we will end up with the absurd result that our actual evidence heavily favours hypotheses 
according to which the ratio of rocks to living beings is low over hypotheses according 
to which it is high. We can thus refine our understanding of how the schematic notion of 
an ‘observer’ should be understood by considering our judgements about such cases. But 
there will still, inevitably, be hard cases where the intuitions are unclear. 

Some friends of PROPORTION might hope to draw some principled, sharp line between 
observers and non-observers.¹³ We do not think this can be done, but we also do not think 
that this is a problem for the basic thought underlying PROPORTION. Given that all the 
relevant factors are continuous, we should probably allow reasonable prior credence func- 
tions that blur the line in one way or another — for example, by assigning real-valued 
‘degrees of observerhood’ which one integrates over time, instead of simply looking at the 
duration for which a single property is instantiated, and/or by taking a weighted average 
of many different probability functions each of which obeys PROPORTION on a different 
conception of observerhood. Probably, too, we should be somewhat permissive, allowing 
different reasonable prior credence functions corresponding to different ways of cashing 
out observerhood. Perhaps, if Carroll’s lizards are sufficiently mentally alien, they may be 
among the things that can reasonably be excluded altogether, or assigned an observerhood 
score much less than that of humans, or excluded by most of the probability functions that 
enter into the final weighted average. 

13 A natural thought for those inclined towards some kind of mind-body dualism is that the sharp line in question 
is the one between consciousness and lack of consciousness (see, e.g. Page, this volume). 

The general dialectical situation here is quite similar to the situation we are in in con- 
nection with the idea that reasonable prior credence functions favour simpler theories over 
more complex ones: there are many different ways of making the notion of simplicity 
precise, and we doubt that there is any uniquely natural, rationally compulsory way of 
measuring simplicity and taking it into account. Things are messy, but this should not stop 
us from trusting our judgements about particular cases where it is relatively obvious how 
the simplicity comparisons pan out. The appropriate methodology for working out how 
simplicity relates to reasonableness seems to be one of ‘reflective equilibrium’; insofar 
as we are drawn to something like PROPORTION, our attitude about what to count as an 
observer should be worked out in the same way. 

Fortunately, the details about what counts as an observer for the purposes of PROPOR- 
TION do not seem to be relevant to any of the theoretical comparisons that have arisen in 
physics. For example, in realistic ways of filling in the details of Brains or No Brains?, 
it will not actually matter whether we count disembodied brains as observers when calcu- 
lating the relevant proportions. We will get almost the same results if we only count fully 
embodied creatures, or creatures that live lives long enough to realise certain cognitive 
capacities, or creatures that achieve certain kinds of interaction with their environments. 
In each case, Brains still will entail (a) that the total duration of the lives of the post-heat- 
death ‘observers’ is far greater than that of the lives of the pre-heat-death ‘observers’, 
and (b) that the qualitative property ascribed by our total evidence takes up a much lower 
proportion of post-heat-death observer time than of pre-heat-death observer time. 

One further point: if we were only concerned with the question how we should respond 
to some change in our evidence, taking for granted that our credences before the change 
are reasonable, there would be no reason to concern ourselves with the question what to 
count as an observer. For given certain very weak assumptions, we can use PROPORTION 
to derive a formula which specifies our new credences conditional on some H in terms of 
our old credences conditional on H, in which the notion of observerhood does not appear 
at all except in delimiting the possible values of H: 


PROPORTIONAL UPDATE: When a reasonable person has evidence I am $E$ and credence 
function $C_t$ at $t$, and evidence I am $E^+$ and credence function $C_{t^+}$ at $t^+$, and $H$ is a 
qualitative proposition that entails that the total duration of observer time is finite and that 
the total duration of $E$ is positive if the total duration of $E^+$ is, 


\[
C_{t^+}(\text{I am } F \mid H) = \frac{C_t((FE^+ : E) \mid H)}{C_t((E^+ : E) \mid H)}
\]


whenever the right-hand side is defined. 


To derive PROPORTIONAL UPDATE from PROPORTION, the only assumption we need to 
make about the property of observerhood is that it is entailed by both $E$ and $E^+$. Begin with 
a definition and two preliminary observations. Definition: when $R$ and $S$ are two random 
variables, let $R \times S$ be the random variable such that for any non-zero $x$, $R \times S = x$ is the 
disjunction of all conjunctions $R = y \wedge S = z$ where $yz = x$, while $R \times S = 0$ is just 
$R = 0 \vee S = 0$ (and thus can be true even when one of $R$ and $S$ is undefined). First observation: 
given that $E$ and $E^+$ entail observerhood and that $H$ entails that $E$ has positive duration 
whenever $E^+$ does, $H$ entails that $(FE^+ : \text{observer}) = (FE^+ : E) \times (E : \text{observer})$. Second 
observation: PROPORTION yields the following expression for the posterior expectation of 
any qualitative random variable $R$ conditional on $H$: 


\[
C(R \mid \text{I am } E \wedge H) = \frac{C(R \times (E : \text{observer}) \mid H)}{C((E : \text{observer}) \mid H)}
\]


This gives us what we need to establish PROPORTIONAL UPDATE: 


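\[
\begin{aligned}
C_{t^+}(\text{I am } F \mid H)
&= \frac{C(\text{I am } F \text{ and } E^+ \mid H)}{C(\text{I am } E^+ \mid H)}
 = \frac{C((FE^+ : \text{observer}) \mid H)}{C((E^+ : \text{observer}) \mid H)} \\
&= \frac{C((FE^+ : E) \times (E : \text{observer}) \mid H)}{C((E^+ : E) \times (E : \text{observer}) \mid H)}
 = \frac{C((FE^+ : E) \mid \text{I am } E \wedge H)}{C((E^+ : E) \mid \text{I am } E \wedge H)}
 = \frac{C_t((FE^+ : E) \mid H)}{C_t((E^+ : E) \mid H)}
\end{aligned}
\]


where the equalities are justified respectively by the fact that $E^+$ is your evidence at $t^+$, by 
PROPORTION, by the first observation, by the second observation, and by the fact that $E$ is 
your evidence at $t$. 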


PROPORTIONAL UPDATE in turn yields a rule that we can use in the same way that peo- 
ple standardly use Bayes’s rule, to express how much a change in evidence favours one 
qualitative hypothesis over another (when the conditions of PROPORTIONAL UPDATE are 
met):¹⁴ 

\[
\frac{C_{t^+}(H_1)}{C_{t^+}(H_2)} = \frac{C_t((E^+ : E) \mid H_1)}{C_t((E^+ : E) \mid H_2)} \cdot \frac{C_t(H_1)}{C_t(H_2)}
\]

The question what counts as an observer for the purposes of PROPORTION can thus be 
bracketed when we are only concerned with assessing the impact of new evidence. How- 
ever, unlike many Bayesians, we are interested in questions about synchronic rationality 
(what credences are reasonable given certain evidence) as well as diachronic rationality 
(how one’s credences should evolve given certain changes in one’s evidence, assuming 
they were reasonable to begin with). So we do not take this as a dissolution of the question 
what counts as an observer.¹⁵ 
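
To see how the odds-form rule behaves, here is a minimal Python sketch with purely hypothetical numbers. The function name `updated_odds` and its arguments are our own illustrative choices: `prior_odds` stands for the old odds of H1 against H2, and each ratio argument stands for the expected value of (E⁺ : E) conditional on the relevant hypothesis.

```python
from fractions import Fraction


def updated_odds(prior_odds, ratio_given_h1, ratio_given_h2):
    """Return the new odds of H1 against H2 after the evidence changes from E to E+.

    prior_odds      -- old odds C_t(H1) / C_t(H2)
    ratio_given_h1  -- expected value of (E+ : E) conditional on H1
    ratio_given_h2  -- expected value of (E+ : E) conditional on H2
    All inputs here are hypothetical illustrations, not values from the text.
    """
    return Fraction(prior_odds) * Fraction(ratio_given_h1) / Fraction(ratio_given_h2)


# Hypothetical case: even prior odds; the new evidence picks out a tenth of
# the old-evidence observer time on H1, but only a thousandth on H2.
odds = updated_odds(Fraction(1), Fraction(1, 10), Fraction(1, 1000))
print(odds)  # odds of H1 against H2 after the update
```

Note that no notion of observerhood enters the calculation: only ratios of E⁺ time to E time are needed, which is the point of the bracketing remark above.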


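14 Proof: $C_{t^+}(H_1)/C_{t^+}(H_2) = C_{t^+}(\text{I am such that } H_1 \mid H_1 \vee H_2)/C_{t^+}(\text{I am such that } H_2 \mid H_1 \vee H_2) = 
C_t((E^+\text{-such-that-}H_1 : E) \mid H_1 \vee H_2)/C_t((E^+\text{-such-that-}H_2 : E) \mid H_1 \vee H_2)$ (by PROPORTIONAL UPDATE) $= 
(C_t((E^+ : E) \mid H_1)\,C_t(H_1 \mid H_1 \vee H_2))/(C_t((E^+ : E) \mid H_2)\,C_t(H_2 \mid H_1 \vee H_2)) = 
(C_t((E^+ : E) \mid H_1)\,C_t(H_1))/(C_t((E^+ : E) \mid H_2)\,C_t(H_2))$. 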

15 Garriga and Vilenkin (2008) also note that for the purposes of assessing the impact of new evidence, there 
is no need to talk about any ‘reference class’ other than the one given by the old evidence. They seem to be 
interested only in what we called ‘diachronic rationality’, and thus take their method to be a full solution to 
the problem of defining observerhood. PROPORTIONAL UPDATE is an improvement on the updating method 
they describe, which does not take account of the time-relativity of evidence. 

In conclusion, it seems to us that once we modify PROPORTION so as to remove the 
suggestion that there is a unique, binary notion of observerhood that all reasonable prior 
credence functions have to respect, the result is an attractive principle that yields defensible 
results across a wide range of cases. If we only ever had to think about finite populations, 
the simplicity and strength of this principle combined with the plausibility of its conse- 
quences would constitute good grounds for accepting it. However, we cannot reasonably 
assign a credence of zero to the hypothesis that there are infinitely many observers. And 
given this, we should surely want whatever we say about the epistemology of self-locating 
belief in finite worlds to emerge as a special case of some more general epistemological 
theory that also has something to say about infinite worlds. Thus, we will have to inves- 
tigate the infinite case before forming a final view about PROPORTION. In the remaining 
sections we will make a start on this project. 


20.4 Infinite Populations 


Let us begin with some especially straightforward infinite-world hypothesis, where there is 
an obvious, uniquely natural way of generalising PROPORTION-style reasoning by taking 
limits. 


Chessboard: Black and white houses are arranged on a two-dimensional plane, in a chess- 
board pattern. At a certain time, one person is born in each house, and lives the next 20 
years inside that house. At that point all the doors of the houses are unlocked, and the 
people get to leave their houses, see what colour they are, and explore their immediate 
neighbourhood. Sixty years later, everyone dies. No other living creatures ever exist. 


What prior credence should you have that you were born in a black house, conditional 
on Chessboard? Or equivalently: if you are still locked inside your house, how confident 
should you be, conditional on Chessboard, that you will find it to be black when you get 
let out? The natural answer is 1/2. 

It is sometimes suggested that there is a deep conceptual problem about endorsing this 
natural answer. Perhaps the thought is that in claiming that your credence that you are in 
a black house conditional on Chessboard should be 1/2, we are somehow forgetting about 
the fact that it is not true to say that the proportion of observers who are in black houses if 
Chessboard is true is 1/2 (or any other number), since there is no such thing as the ratio of 
infinity to infinity. But this assumes that claims about proportions provide the only possible 
basis for favouring some credences over others in this case. We see no grounds for any such 
assumption. 

Finding a fully general principle that entails the natural answer concerning Chessboard 
is a very tall order. But we can take a step in that direction by formulating a principle that 
tells us how to assign prior credences to self-locating propositions conditional on hypothe- 
ses like Chessboard, which describe approximately static arrangements of observers in a 
fixed background space. 


LIMITING PROPORTION: Suppose that F is some qualitative property, and H is a qual- 
itative proposition that entails that every finite region only ever contains finitely many 
observers each of whom has a finite life, and that 


(i) There is a certain real number $x$ such that, for any all-encompassing nested sequence 
of concentric spheres $\sigma_1, \sigma_2, \sigma_3, \ldots$, $x$ is the limit of the sequence $x_1, x_2, x_3, \ldots$, where $x_i$ 
is the proportion of the total duration of the lives of observers whose lives are confined 
to $\sigma_i$ during which they are F. 

(ii) Observers do not move around too much: there is a finite upper bound to the lengths 
of the journeys they take over the course of their lives.¹⁶ 


Then C(I am F | H) = x for any reasonable prior credence function C for which C(H) is 
positive. 


(We call a sequence of regions ‘all-encompassing’ just in case its union is the entire space.) 

One might worry that there is something objectionably arbitrary about using the family 
of orderings of observers generated by the nested spheres to set a constraint on priors. 
After all, so long as the cardinality of F observers is the same as the cardinality of non-F 
observers, one can find orderings of the observers in which the limiting proportion of F 
observers takes any value one pleases. However, in general, the definition of one of these 
competitor orderings — or of a family of such orderings that agree on the limiting proportion 
of F observers — will be far more complicated than the definition of the family of orderings 
generated by nested sequences of spheres. For example, in the case of Chessboard the 
sequences of observers in which the limiting frequency of observers in black houses is 
anything other than 1/2 are, intuitively, quite crazy, jumping around in ever-larger leaps 
with no discernible logic beyond the imperative to make the limiting frequency come out 
at a specified value. Thus, insofar as one is comfortable with the idea that considerations 
of simplicity can play a legitimate role in making a difference between reasonable and 
unreasonable priors, it is hard to see how there could be any deep problem with the thesis 
that prior credences based on taking limits in nested spheres are more reasonable than prior 
credences based on taking limits using some ordering that gives a limiting proportion other 
than 1/2. 
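
The contrast between the two kinds of ordering can be illustrated with a small Python sketch. It rests on simplifying assumptions of our own: houses sit at the integer grid points of the plane, a house at (i, j) is black just in case i + j is even, and every house holds one observer with the same lifespan, so that proportions of observer time reduce to proportions of houses. The function names are hypothetical.

```python
def black_fraction_in_disc(radius):
    """Fraction of houses within `radius` of the origin that are black.

    Houses are assumed to sit at integer grid points (i, j), coloured black
    when i + j is even (a chessboard pattern). `radius` is a positive integer.
    """
    black = total = 0
    for i in range(-radius, radius + 1):
        for j in range(-radius, radius + 1):
            if i * i + j * j <= radius * radius:
                total += 1
                if (i + j) % 2 == 0:
                    black += 1
    return black / total


def black_fraction_in_cooked_ordering(n_blocks):
    """Fraction of black houses among the first 3 * n_blocks houses of a
    cooked-up ordering that always lists two black houses, then one white."""
    sequence = ["black", "black", "white"] * n_blocks
    return sequence.count("black") / len(sequence)


# Nested discs: the fraction settles near 1/2 as the radius grows.
for r in (5, 50, 200):
    print(r, black_fraction_in_disc(r))

# Cooked-up ordering: the fraction is stuck at 2/3, however far we go.
print(black_fraction_in_cooked_ordering(1000))
```

Since there are infinitely many houses of each colour, the two-black-one-white ordering does eventually list every house; but to follow it one has to travel ever further afield to fetch the extra black houses, which is the sense in which such orderings jump around.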

Someone might object that our judgement that the nested-sphere-based orderings are 
simpler than the jumpy orderings that give different limiting frequencies is a merely ‘rel- 
ative’ one. The notion of a sphere is defined in terms of a certain metric; but given any 
ordering of the observers, one could always define a new metric according to which that 
ordering counts as being derived from a sequence of nested ‘spheres’. Relative to the new 
metric, the sequences that looked well behaved relative to the old metric will make arbitrar- 
ily large jumps. But we do not see any problem here. There are very important differences 
between the real metric (the one that matters in physics) and the cooked-up quantities 
relative to which the crazy orderings look simple, and these seem like exactly the sorts 
of differences we should expect reasonable people to be sensitive to. Moreover, since just 
about any theory can be made to look simple by expressing it in a language with appropri- 
ately cooked-up vocabulary, it is hard to see how considerations of simplicity could play 
any substantive role in an epistemology that gave no role to the contrast between natural 
quantities and cooked-up ones.¹⁷ 

16 Given this condition, we will get the same limiting proportion whether we look, for each sphere, at the 
observers whose lives are confined to that sphere; or at the observers whose lives overlap that sphere; or at the 
portions of the lives of observers spent in that sphere. These limits can come apart in far-fetched possibilities 
where the observers move about at unbounded speeds. We consider the puzzles raised by such possibilities in 
Arntzenius and Dorr (MS). 

A different way of motivating the claim that there is a special conceptual problem about 
infinite populations comes from the idea that reasonable prior credence functions should 
be permutation-invariant, in the following sense: 


PERMUTATION-INVARIANCE: When H entails that every observer bears R to exactly one 
observer and that exactly one observer bears R to every observer, C(I am F | H) = C(I bear R 
to someone who is F | H) for any reasonable prior credence function C for which C(H) > 0. 


PERMUTATION-INVARIANCE is inconsistent with LIMITING PROPORTION. Suppose we 
define an ordering of all the observers with no first or last member, in which every third 
observer is in a black house and the rest are in white houses. If we let R be the relation 
every observer bears to the next observer in this ordering, PERMUTATION-INVARIANCE 
entails that C(I am in a black house | Chessboard) = C(I bear R to someone in a black 
house | Chessboard) = C(I bear R to someone who bears R to someone in a black house | 
Chessboard). Since Chessboard entails that every observer falls into exactly one of these 
categories, C(I am in a black house | I am an observer and Chessboard is true) must equal 
1/3 (if it is defined at all). But for the same reason, since we can define periodic orderings 
of the observers corresponding to any rational number between 0 and 1, PERMUTATION- 
INVARIANCE also entails that C(I am in a black house | I am an observer and Chessboard 
is true) is either ill defined or equal to x for every other rational x ∈ (0, 1). This requires 
that either C(Chessboard) is zero, or C(I am an observer | Chessboard) is zero. Since the 
same reasoning will apply to other infinite-world hypotheses, the upshot is that we should 
be sure that the number of observers is not infinite. We take this to be a decisive reason to 
give up PERMUTATION-INVARIANCE. 


20.5 Infinite Worlds with Multiple Simple Measures 


LIMITING PROPORTION is not applicable to most of the infinite-population hypotheses that 
arise in the context of cosmology. The reason for this is that in relativistic spacetimes there 
is no useful notion of a ‘four-dimensional sphere’: the closest analogues of spheres are 
regions bounded by hyperboloids, but these regions will in general contain infinite numbers 
of observers and hence be useless for the purpose of taking limits. One possible response 
to this limitation would be to embark on a quest for a generalisation of LIMITING PRO- 
PORTION: a single natural rule which prescribes reasonable prior self-locating credences 
conditional on any infinite-world hypothesis that physicists are likely to take seriously. But 
merely formulating a principle at this level of generality, let alone arguing for its truth, 
would be a very difficult task. 

17 For more on naturalness and simplicity, see Lewis (1983) and Dorr and Hawthorne (2013). 

We do not think this is the right way to go. The moral we want to draw from the previous 
section’s qualified defence of LIMITING PROPORTION is not even that LIMITING PRO- 
PORTION is true without exception, but that those who are comfortable with the idea that 
considerations of simplicity play a role in making the difference between reasonable and 
unreasonable priors face no special conceptual problem when it comes to infinite worlds. 
In typical cases where the method of taking limits in nested sequences of spheres yields 
well-defined self-locating priors, it is also far simpler than any method yielding different 
results. But this is not always the case; and in cases where there are simple self-locating 
probability functions that disagree with those yielded by the method of nested spheres, 
the claim that reasonable priors must accord with that method is much more tendentious. 
Consider: 


Uneven Road: Inhabited houses are arranged along an infinite road running east—west. At 
one point there is a wall across the road. To the west of the wall, the houses are 100 metres 
apart; to the east, they are 10 km apart. 


How confident should you be, conditional on Uneven Road, that you are in the western 
(thickly settled) part of the road? LIMITING PROPORTION entails that your credence should 
be 100/101, since this is the limiting proportion of observers to the west of the wall in any 
all-encompassing sequence of nested concentric spheres. But this is not the only principled 
answer that could be given in this case: there is also some temptation to think that your 
credence should be 1/2. This will seem natural insofar as one is gripped by the thought 
that the spacing of the houses is rationally irrelevant. And this answer can also be gener- 
ated by a reasonably simple (albeit rather less general) method of assigning probabilities 
to self-locating propositions, namely a method which looks, not at nested sequences of 
concentric spheres, but at nested sequences of segments of the road in which each member 
of the sequence expands on its predecessor by adding the same number of houses in both 
directions. 
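
The two recipes can be compared in a short Python sketch. The layout is an assumption of ours: the wall sits at position 0, western houses at 100 m, 200 m, ... west of it and eastern houses at 10 km, 20 km, ... east of it, with one equally long-lived observer per house. For simplicity the nested ‘spheres’ (here, intervals) are centred on the wall; the limiting value should not depend on that choice. Function names are hypothetical.

```python
def west_fraction_by_distance(radius_m, west_gap_m=100, east_gap_m=10_000):
    """Fraction of houses within `radius_m` metres of the wall that lie to
    the west. Assumes radius_m >= west_gap_m, so at least one house is counted."""
    west = radius_m // west_gap_m   # western houses inside the interval
    east = radius_m // east_gap_m   # eastern houses inside the interval
    return west / (west + east)


def west_fraction_by_house_count(houses_each_side):
    """Fraction of western houses when each step of the nested sequence
    adds the same number of houses on both sides of the wall."""
    return houses_each_side / (2 * houses_each_side)


# Nested intervals of growing radius: the fraction tends to 100/101.
for radius in (10_000, 1_000_000, 10_000_000):
    print(radius, west_fraction_by_distance(radius))

# Equal numbers of houses added in each direction: the fraction is always 1/2.
print(west_fraction_by_house_count(1_000))
```

The first recipe is sensitive to the spacing of the houses and the second is blind to it, which is exactly the difference between the two candidate answers.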

The prior credences prescribed by LIMITING PROPORTION in this case strike us as some- 
what dogmatic: finding that you do not live in a densely packed region does not seem like 
very strong evidence against Uneven Road. But the claim that you should be equally con- 
fident that you are in the western and eastern regions also seems unpromising — it is hard 
to see what plausible general principle could underlie such a prescription. We suggest 
that in cases like this, where there are multiple simple recipes for assigning credences to 
self-locating propositions conditional on some qualitative hypothesis, the most reasonable 
approach is to split the difference. Conditional on such a hypothesis, reasonable prior cre- 
dences will be generated by taking a weighted average of the credences that result from 
the different simple methods, in which the simpler ones get weighted more heavily. 

To make this talk of ‘recipes’ more precise: let a cosmological measure be a function 
$\mu$ from qualitative properties to qualitative random variables, such that conditional on the 
proposition that $\mu(G)$ is defined for at least one property $G$: 


(i) $\mu(F) = 1$ follows from Always, everything is $F$, 
(ii) $\mu(F) \leq \mu(F')$ follows from Always, everything $F$ is $F'$, and 
(iii) $\mu(F) + \mu(F') = \mu(F'')$ follows from Always, everything $F''$ is either $F$ or $F'$ but not 
both. 


Given a measure $\mu$ and any probability function $P$ on qualitative propositions which 
assigns probability 1 to $\mu$ being defined, we can extend $P$ to a self-locating probability 
function $P^{[\mu]}$ simply by taking $P^{[\mu]}(\text{I am } F)$ to be $P(\mu(F))$. A self-locating probability 
function thus corresponds to the combination of a qualitative probability function and 
a cosmological measure. In these terms, our proposed ‘compromising’ approach can be 
stated as follows: 


COMPROMISE: For any reasonable prior $C$ and sufficiently specific qualitative $H$, 


\[
C(\text{I am } F \mid H) = \sum_i w_i \, C(\mu_i(F) \mid H)
\]


where $\mu_i$ are simple measures which are well-defined according to $H$, and $w_i$ are weights 
summing to 1, generally higher for simpler $\mu_i$. 


The ‘sufficiently specific’ H should be, at the minimum, specific enough to settle, of each 
simple measure, whether it is well defined or not. More generally, the point is to focus on 
hypotheses that pin things down in enough detail that there is no controversy about what 
the qualitative priors should be conditional on them, so that all of the debate pertains to the 
self-locating priors. 
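
As a rough illustration, the following Python sketch applies the weighted average to Uneven Road. The weights are purely hypothetical (the text does not fix any), and we assume that the hypothesis settles the value each measure assigns to being west of the wall, so that the expectation of each measure's value is just a number: 100/101 for the nested-sphere measure and 1/2 for the house-counting measure. The function name is our own.

```python
from fractions import Fraction


def compromise_credence(weighted_values):
    """Weighted average of the values that several simple measures assign
    to a self-locating proposition, conditional on a fixed hypothesis.

    weighted_values -- list of (weight, value) pairs; weights must sum to 1.
    """
    if sum(weight for weight, _ in weighted_values) != 1:
        raise ValueError("weights must sum to 1")
    return sum(weight * value for weight, value in weighted_values)


# Hypothetical weights: 3/4 for the nested-sphere measure (treated here as
# the simpler one), 1/4 for the house-counting measure.
credence_west = compromise_credence([
    (Fraction(3, 4), Fraction(100, 101)),  # nested spheres
    (Fraction(1, 4), Fraction(1, 2)),      # equal numbers of houses each way
])
print(credence_west, float(credence_west))
```

With these weights the credence of being west of the wall lies between the two candidate answers, so that finding oneself in the sparsely settled east counts as some evidence against Uneven Road, but much weaker evidence than the nested-sphere measure alone would make it.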

(We can be more precise about the restriction to ‘sufficiently specific’ H if we derive 
COMPROMISE from the following attractive account of the role of simplicity in reasonable 
priors: 


SIMPLICITY: A reasonable prior credence function $C$ is a weighted average $\sum_i w_i P_i$ of 
self-locating probability functions $P_i$, where $w_i$ is generally higher for simpler $P_i$. 


Plausibly all or almost all of the weights should go to probability functions $P_i$ that are of 
the form $Q_i^{[\mu_i]}$ for some qualitative probability function $Q_i$ and measure $\mu_i$. In this setting, 
the ‘sufficiently specific’ $H$ mentioned by COMPROMISE can be characterised as those 
for which $Q_i(\cdot \mid H)$ and $Q_j(\cdot \mid H)$ are everywhere approximately identical whenever both are 
defined and $Q_i$ and $Q_j$ are simple enough to receive significant weight. If this condition 
is met, $C(R \mid H)$ will be approximately the same as $Q_i(R \mid H)$ for any qualitative random 
variable $R$; and so we have $C(\text{I am } F \mid H) = C(\text{I am } F \wedge H)/C(H) = \sum_i w_i P_i(\text{I am } F \wedge H)/C(H) 
= \sum_i w_i Q_i(\mu_i(F \text{ such that } H))/C(H) \approx \sum_i w_i C(\mu_i(F \text{ such that } H))/C(H) = 
\sum_i w_i C(\mu_i(F) \mid H)$.) 

COMPROMISE is especially plausible when we turn to hypotheses in which LIMITING 
PROPORTION does not apply, but where there are multiple other simple methods of assign- 
ing probabilities to self-locating propositions. In many such cases, the quest for a general 
principle which would privilege a particular simple method as the one corresponding to a 
reasonable self-locating prior credence function seems misguided. Consider: 


The Cliff: There is an infinite half-plane dotted with black, grey and white houses, ter- 
minated to the north by a straight, infinite cliff edge. The houses and their inhabitants 
get exponentially smaller and more tightly packed as we approach the cliff. The dis- 
tance between the centre of a house and the cliff is always a power of two (in metres). 
The black houses on the line $2^n$ metres from the cliff are distributed randomly, in such 
a way that the expected number of houses in each segment of length $l$ is equal to $l/2^n$: 
thus the average spacing between black houses on a given line is equal to the distance 
between that line and the cliff. The distribution of grey and white houses is determined 
by the distribution of black houses, as follows: for each black house, there is a grey 
house exactly halfway between it and the cliff, and for each grey house, there is a white 
house exactly halfway between it and the cliff. These are the only grey and white houses 
(see Figure 20.1). 


There are two simple ways of assigning probabilities to self-locating propositions con- 
ditional on The Cliff, which assign different probabilities to I am in a black house. One 
approach assigns a probability of 1/3, based on the fact that each north-south line of houses 
contains three houses of which one is black. The other approach assigns a probability of 
4/7, based on the fact that on any given east–west line of houses, black houses are twice 
as common as grey houses, and grey houses are twice as common as white houses, so that 
(with chance 1) the limiting proportion of black houses along any east–west line is 4/7. For 
the same reason, the limiting proportion of black houses in any all-encompassing nested 
sequence of rectangular regions (all with finite populations) will also be 4/7.¹⁸ 

Figure 20.1 The Cliff 
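
The arithmetic behind the two answers can be checked with a short Python sketch. It only encodes the densities stated in the description of The Cliff; the function names, and the indexing of east–west lines by the exponent n of their distance from the cliff, are our own conventions.

```python
from fractions import Fraction


def east_west_black_fraction(n):
    """Expected fraction of black houses on the east-west line 2**n metres
    from the cliff.

    Black houses on this line have density 1 / 2**n per metre. Grey houses
    on it lie halfway between the cliff and black houses on the line at
    2**(n + 1) metres, so they inherit the density 1 / 2**(n + 1); white
    houses likewise inherit the density 1 / 2**(n + 2).
    """
    distance = Fraction(2) ** n
    black = 1 / distance
    grey = 1 / (2 * distance)
    white = 1 / (4 * distance)
    return black / (black + grey + white)


def north_south_black_fraction():
    """Fraction of black houses in each north-south triple (black, grey, white)."""
    return Fraction(1, 3)


# The east-west fraction is the same on every line, whatever its distance.
print([east_west_black_fraction(n) for n in range(-3, 4)])
print(north_south_black_fraction())
```

Both recipes are simple and both are well defined on The Cliff, yet they disagree, which is what makes the case a test for the compromising approach.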

Instead of the compromising approach, one might suggest a permissivist approach 
according to which both of the simple self-locating probability functions correspond to 
optimally reasonable prior credence functions conditional on The Cliff. More generally, the 
thought would be that each sufficiently simple measure can be used to generate a maximally 
reasonable prior credence function conditional on a given detailed qualitative hypothesis. 
(There is some pressure for those who take this permissivist view to allow that weighted 
averages of maximally reasonable prior credence functions are also maximally reasonable.) 
But this view is implausibly liberal. By modifying the details of The Cliff, one can make 
the two candidate credences as close as one pleases to 1 and 0: just let each black house be 
the southernmost member of a very long series of non-black houses, and allow the density 
of black houses to increase by an arbitrarily large factor with each step towards the cliff.¹⁹ 
The extreme credence functions that assign I am in a black house probabilities close to 0 or 
1 in these cases seem clearly less reasonable than the weighted-average credence functions 
that assign intermediate credences — it seems absurdly dogmatic to treat either the discov- 
ery that one is in a black house, or the discovery that one is not, as incredibly weighty 
evidence against The Cliff. 

In endorsing COMPROMISE, we do not mean to commit ourselves to the strong claim 
that simplicity is the only factor relevant to setting the weights assigned in a reasonable 
prior. There may be some further conditions that measures need to meet in order to deserve 
any weight at all. Note that all the measures we have considered make reference to a notion 
of observerhood, something which — as we discussed in connection with PROPORTION — 
could be understood in many different ways. A flat-footed extension of the compromising 
approach to the question what should count as an observer, according to which we simply 
take a weighted average of all probability functions which can be defined by appealing to 
simple criteria of observerhood — even crazy ones that count rocks as observers! — would 
have quite implausible consequences. Thus, we can already see that a fuller articulation 
of the compromising approach will need to appeal to some considerations other than sim- 
plicity to keep the final weighted average from being dominated by probability functions 
defined using crazy (but simple) criteria of observerhood. The question what these consid- 
erations should look like is closely bound up with the question whether PROPORTION is 
true. It could easily turn out that the best way of excluding the crazy probability functions 
has as a consequence that the only probability functions that should receive any positive 
weight are functions that agree with PROPORTION in finite worlds (as do all the limiting 
procedures that we have considered). If this proved to be so, it would be good news for 
PROPORTION. If not, PROPORTION might start to look like an ugly and ad hoc addition, 
out of keeping with the spirit of the compromising approach. We leave further investigation 
of this question as a topic for future research. 

18 In this case there are no all-encompassing, nested sequences of concentric circles each of which has a 
finite population: since circles that extend past the cliff edge have infinite populations, any all-encompassing 
sequence of nested circles in which each circle has a finite population must have centres that get further and 
further away from the cliff edge. The limiting proportion of black houses in any such sequence is also 4/7. 

19 Indeed, if we modify the case so that each black house is the southernmost member of an infinite series of 
non-black houses, the north-south way of assigning probabilities will require assigning probability zero to 
being in a black house, while we can still make the east-west proportion of black houses arbitrarily close to 1. 


20.6 The Compromising Approach and the Measure Problem in Cosmology 


Let us consider one way in which modern cosmology prompts us to take seriously the 
possibility that there are infinitely many observers. According to the theory of inflation, if 
you follow the geodesic paths that are currently occupied by observable galaxies back far 
enough, you eventually — after 14 billion years or so — reach an ‘inflationary’ era during 
which the paths (as we follow them backward) approach one another at an exponential 
rate. This theory is fantastically successful by normal scientific standards. But models of 
the universe as a whole which provide a mechanism for such inflation typically feature 
eternal inflation — a kind of universe in which pockets of ordinary, non-inflating space keep 
forming, but in such a way that the inflating portion of space is never completely filled, but 
keeps expanding and giving rise to new non-inflating pockets. In the most plausible such 
models, there are many different kinds of pockets, only a few of which are hospitable to 
life. Nevertheless there is plenty of life: in fact there will be infinitely many life-friendly 
pockets as well as infinitely many life-unfriendly ones, and the life-friendly pockets will 
typically contain infinitely many observers each. We attempt to illustrate the general picture 
in Figure 20.2. 

Since these hypotheses are set in relativistic spacetime, the proposal to assign probabil- 
ities by taking limiting relative frequencies in sequences of nested spheres does not even 
make sense. Nor does any other way of assigning self-locating probabilities look to be 
overwhelmingly simpler than all the others. Instead, what we generally find is a multi- 
plicity of non-equivalent, reasonably simple cosmological measures. Here are just a few 
examples. The ‘proper time measure’ (see Linde, 1986; Garcia-Bellido and Linde, 1995) 
is defined by starting with an (almost) arbitrary bounded, smooth, spacelike hypersurface; 
using it to construct a nested sequence of four-dimensional regions, each of which is the 
union of all timelike geodesic segments of a particular finite length, perpendicular to and 
extending futurewards from the chosen hypersurface; and assigning self-locating probabil- 
ities by taking limits within this sequence of regions, as in LIMITING PROPORTION. The 
‘scale factor measure’ (De Simone et al., 2010) is similar, except that instead of following 
the geodesics along by constant amounts of proper time, we use a time co-ordinate given 
by the scale factor: essentially, we follow each geodesic as far as we need to in order to reach a 
hypersurface on which the distances between nearby geodesics are a constant multiple of 
the distances between them on the initial hypersurface. For the ‘causal diamond measure’ 
(Bousso, 2006), we choose a timelike geodesic and take limits in a nested sequence of 
four-dimensional regions sandwiched between the forward light cones of points increas- 
ingly early on that geodesic and the backward light cones of points increasingly late on that 
geodesic. (Unlike the sequences of nested spheres considered in LIMITING PROPORTION, 
the sequences of regions considered by these three measures need not be all-encompassing; 
nevertheless, if the definitions of the measures are filled in properly, all the sequences of 
regions meeting the relevant criteria should yield the same limiting proportions.) There are 
also measures that are not based on taking limits in sequences of bounded regions at all: 
for example, the ‘pocket-based measure’ of Garriga et al. (2006) works in two stages, first
generating a separate measure for each vacuum state, and then aggregating these using a 
separate recipe for assigning weights to the different vacuum states. 
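
The dependence of limiting proportions on the choice of nested sequence can be seen in a
deliberately simple toy case, sketched below in Python. This is not an implementation of any of
the cosmological measures just described: the ‘observers’ are simply the non-negative integers,
the property of interest is being even, and the two orderings (`natural_order` and
`two_evens_per_odd`, both hypothetical names) stand in for two different nested sequences of
regions that exhaust the same infinite population.

```python
from itertools import count, islice

def natural_order():
    """Enumerate observers 0, 1, 2, ... so that evens and odds alternate."""
    return count(0)

def two_evens_per_odd():
    """Enumerate the same observers, taking two evens for every odd."""
    evens, odds = count(0, 2), count(1, 2)
    while True:
        yield next(evens)
        yield next(evens)
        yield next(odds)

def proportion_even(ordering, n):
    """Fraction of even-labelled observers among the first n in the ordering."""
    first_n = list(islice(ordering, n))
    return sum(1 for k in first_n if k % 2 == 0) / n

p_natural = proportion_even(natural_order(), 30000)
p_skewed = proportion_even(two_evens_per_odd(), 30000)
print(p_natural, p_skewed)
```

Both orderings eventually include every integer, yet the proportion of evens tends to 1/2 under
the first and to 2/3 under the second. That is the sense in which non-equivalent measures can
coexist on one and the same infinite population.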

Some of these measures have turned out to be ‘pathological’ in that they assign vanish- 
ingly low probability to our actual evidence — according to them, ‘almost all’ observers 
in the relevant models are, in some way or other, drastically and manifestly unlike us. For 
example, the proper time measure suffers from what is sometimes called the ‘youngness 
paradox’ (see Bousso, Freivogel and Yang, 2008; Tegmark, 2005). Since new pocket uni- 
verses are being created in the inflating region at such a high rate, at each region in one 
of the relevant nested sequences, the pocket universes that have only just been added con- 
stitute a high proportion of all the pocket universes in that region. Because there are so 
many ‘new’ pockets, the observers in the new pockets who have come into existence quite 
soon after their local Big Bang (the initial boundary of their pocket) outnumber all the 
observers in the older pockets. As a result, when we take limits using these sequences, 
observations like ours — e.g. of measuring the temperature of the microwave background 
to be 2.7 K — will be assigned far lower probabilities than the kinds of observations that 
would be expected closer to a Big Bang, e.g. measurements of higher background tem- 
peratures. Indeed, the measure is so drastically skewed towards early observers that by its 
lights, ‘almost all’ observers are ‘Boltzmann Babies’, freak observers who fluctuate into
existence at times when the universe was so hot as to be extremely inhospitable to life,
and spend the entirety of their short lives being roasted to death by their infernally hot
surroundings.²⁰ Some other measures suffer from a more familiar kind of pathology: they are
dominated not by observers who live much closer than us to their local Big Bang, but by 
observers who live much further — Boltzmann Brains who have fluctuated into existence 
in the endless freezing vacuum. At least some versions of the pocket-based measure suffer 
from this pathology (Page, 2008). The problem here is that there is a sense in which each 
and every inhabited pocket universe with a positive cosmological constant is dominated 
by Boltzmann Brains: if you begin with a finite-volume portion of the initial boundary of 
such a pocket universe, and evolve this region further and further forwards into the future 
(excluding any ‘child’ pockets that might form inside the initial one), you will continue 
to add Boltzmann Brains without bound, but you will only ever have come across finitely 
many ordinary observers.²¹ This makes it quite challenging to devise a simple measure
on observers in any one vacuum state that is not dominated by Boltzmann Brains (and, 
as usual, also dominated by Boltzmann Brains of the usual sort, living brief and bizarre 
lives).²²
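
A rough feel for how strongly a proper-time cutoff favours young pockets can be had from the toy
calculation below. It assumes, purely for illustration, that pockets nucleate at a rate
proportional to exp(growth_rate * t) in proper time t. The function name, the rate of 3 per unit
time and the age threshold are all hypothetical choices, not values taken from the works cited.
The question asked is what fraction of the pockets nucleated before a cutoff are at least a given
age at that cutoff.

```python
import math

def fraction_older_than(age, cutoff, growth_rate):
    """Fraction of pockets nucleated in [0, cutoff] that are at least `age` old at the cutoff.

    Assumes (for illustration only) a nucleation rate proportional to
    exp(growth_rate * t), so the number nucleated before time t is
    proportional to exp(growth_rate * t) - 1.
    """
    if age >= cutoff:
        return 0.0
    nucleated_early_enough = math.expm1(growth_rate * (cutoff - age))
    nucleated_in_total = math.expm1(growth_rate * cutoff)
    return nucleated_early_enough / nucleated_in_total

growth_rate = 3.0  # hypothetical, in inverse units of the time variable
age = 5.0          # hypothetical age threshold
fractions = {cutoff: fraction_older_than(age, cutoff, growth_rate)
             for cutoff in (10.0, 50.0, 200.0)}
limit = math.exp(-growth_rate * age)

for cutoff, frac in fractions.items():
    print(f"cutoff={cutoff:6.1f}  fraction at least {age} old: {frac:.3e}")
print(f"limiting value exp(-growth_rate * age): {limit:.3e}")
```

Under this assumption the fraction settles towards exp(-growth_rate * age) as the cutoff is
pushed later, so older pockets are exponentially outnumbered at every stage of the limit.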

There is a rough analogy here with The Cliff. The spacelike surfaces represented by 
horizontal lines in Figure 20.2 are dominated by Boltzmann Babies, just as the east-west 
lines in The Cliff are dominated by black houses, whereas timelike paths (ignoring child 
pockets) are dominated by Boltzmann Brains, just as the north-south lines in The Cliff 
are dominated by non-black houses. Fortunately, whereas in the case of The Cliff there do 
not seem to be any other comparably simple measures, in the case of eternally inflating 
spacetime there is a wider array of alternatives that have not been shown to be in any way 
pathological. 

In introducing the multiplicity of measures, cosmologists often characterise it as a deep 
foundational problem — the ‘measure problem’. Tegmark (2014, p. 314) goes so far as to 
call it ‘the greatest crisis facing physics today’. Some regard this problem as a weighty 
reason to reject inflation in favour of some rival theory: for instance, Steinhardt (2011, 
pp. 42-3) says that ‘The notion of a measure, an ad hoc addition, is an open admission 
that inflationary theory on its own does not explain or predict anything’, and rhetorically 
asks ‘If inflationary theory makes no firm predictions, what is its point?’. Many others 


20 It is also worth noting that, even conditional on our actual evidence about the present and past, a probability
function generated by the proper time measure will assign high probability to the proposition that we are about 
to start getting roasted to death. 

21 In Figure 20.2, imagine that the two dotted geodesics are the boundaries of a region that is open to the future
but bounded in spacelike directions. Then the intersections of this region with the ‘hot’ and ‘life-friendly’ 
parts of our pocket universe have finite spacetime volume and contain finitely many observers, whereas the 
intersection of this region with the ‘cold’ part has infinite spacetime volume, and contains infinitely many 
Boltzmann Brains. 

22 These claims about the existence of Boltzmann Brains in the very late parts of pockets with positive cosmological
constants are standard, but are sensitive to issues about the interpretation of quantum theory. Boddy,
Carroll, and Pollack (this volume) and Goldstein, Struyve, and Tumulka (n.d.) point out ways in which the 
question of Boltzmann Brains may require rethinking on Everettian and pilot-wave interpretations, respec- 
tively. By contrast, the earlier remark about the domination of the proper time measure by Boltzmann Babies 
seems much less interpretation-sensitive.

regard the problem as analogous to the divergences in quantum field theory: not a reason
to reject the relevant hypotheses altogether, but a reason to believe that they are mere 
approximations to, or characterisations of, some emergent behaviour in some yet-to-be- 
discovered underlying theory that does not suffer from the problem. For instance, Tegmark 
(2014, p. 316) thinks that the measure problem is ‘telling us’ that we will have to give up 
on the idea that ‘space can have an infinite volume, that time can continue forever, and that
there can be infinitely many physical objects’. Similarly, Bousso and Freivogel (2007) treat 
the pathological features of certain measures as a reason to treat the infinite population of 
observers as ‘figments of our imagination’, instead favouring the minimalistic (and, prima 
facie, objectionably anthropocentric) view that a causal diamond that includes everything 
causally accessible from our worldline is ‘all there is, as far as the semiclassical description
of the universe goes’. 

While we have no objections to physicists following their hunches, one moral we want to 
draw from our discussion in previous sections is that the ‘measure problem’ does not con- 
stitute a reason to disbelieve the infinite-population hypotheses that give rise to it. There is 
no good a priori or empirical reason to be confident that our universe is one of the ‘well- 
behaved’ infinite ones with a unique simple measure, let alone that it is finite. The fact that 
there are some simple self-locating probability functions that give high probability to a cer- 
tain hypothesis about the structure of the world as a whole while giving a vanishingly small 
probability to our evidence does not constitute any kind of reason to reject that hypothesis. 
What would constitute a reason to reject the hypothesis would be the discovery that every 
simple self-locating probability function that assigns it substantial probability assigns
vanishingly small probability to our evidence.²⁴ And we have not discovered anything like this
in the case of eternal inflation. 

Despite the hand-wringing about foundational problems, the actual practice that cosmol- 
ogists have adopted in reasoning scientifically about infinite-population hypotheses looks 
very much like what would be recommended by our ‘compromising’ account of the role
of simplicity considerations in reasonable priors. A theorist will come up with a defini- 
tion of a measure that works for a certain model of the cosmos, and try to figure out what 
kinds of observations are probable according to that measure (often a difficult task). When 
a particular measure is shown to have some pathological feature such as assigning high 
probability to being a Boltzmann Baby, the theorist’s reaction is not to give up straight 
away on the relevant cosmological model, or suddenly to start treating sceptical scenarios 
in which we actually are Boltzmann Babies (or freak observers of some other sort) as live 
options. Rather, the theorist will start looking for some alternative simple measure which 
does not suffer from any such pathology. The goal is to find a pair of a simple cosmological 
hypothesis and a simple measure that together give reasonably high probability to certain 


23 Bousso and Freivogel seem to think that this eliminativist attitude is required by the use of their causal diamond 
measure, but we do not see why this should be so. 

24 ‘Vanishingly small’ here really means: small by comparison with the probability assigned to our evidence by 
other simple self-locating probability functions that do not assign high probability to the given cosmological 
hypothesis.

characteristic properties attributed to us by our actual evidence, such as observing a background
temperature not too far from what we actually observe. And the cosmologists seem
to be making considerable progress towards this goal — indeed the whole enterprise looks 
like science at its best. 

All of this is exactly as it should be on the compromising approach. When generating
reasonable self-locating priors from reasonable qualitative priors by setting
$C(\text{I am } F \mid H) = \sum_i w_i\, C(\mu_i(F) \mid H)$, it will often happen that some of the
$\mu_i$ simple enough for the weight factor $w_i$ to be substantial will be ‘pathological’ in the
sense that they give a very low measure even to the barest outlines of our actual evidence. For
example, there may be some simple $\mu_i$ such that H entails that $\mu_i$(being a Boltzmann Baby
in the midst of being roasted to death) is close to 1. But this is not a problem: so long as there
are any reasonably simple measures that are not pathological in this way, the final weighted
average will be dominated by those terms. That is: our posteriors will be approximately as they
would be if we had started, not with our actual priors, but with a weighted average that excluded
the pathological measures. Indeed, it is reasonable to hope that by doing the right experiments,
we can enrich our total evidence to the point where one simple measure $\mu_j$ will assign it
a vastly higher probability than any other comparably simple measure. In that case, our
posterior self-locating credence in I am F, namely $C(\text{I am } FE)/C(\text{I am } E)$, will be
approximately equal to $C(\mu_j(FE))/C(\mu_j(E))$. For the purposes of making predictions, we will
not have to think about anything other than the particular simple measure $\mu_j$, and we will not
need to concern ourselves with detailed questions about how exactly simplicity should be measured
and weighted.
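
The arithmetic behind this point can be sketched with made-up numbers. The Python below is an
illustration under simplifying assumptions, not a model of any actual cosmological measure: the
hypothesis H is held fixed, so that each measure simply assigns a number to the evidence E and to
the conjunction of F and E, and the weights and values are hypothetical. Measure 0 plays the role
of a simple but ‘pathological’ measure that gives our evidence a vanishingly small value.

```python
def posterior_credence(weights, mu_FE, mu_E):
    """Posterior credence in 'I am F' given 'I am E' for a weighted average of measures.

    weights[i] is the simplicity weight of measure i; mu_FE[i] and mu_E[i] are the
    values measure i assigns to 'F and E' and to 'E' (H is held fixed, so these
    are plain numbers). The weights need not be normalised: only ratios matter.
    """
    numerator = sum(w * fe for w, fe in zip(weights, mu_FE))
    denominator = sum(w * e for w, e in zip(weights, mu_E))
    return numerator / denominator

# All numbers below are hypothetical.
weights = [0.5, 0.3, 0.2]       # measure 0 is the simplest, hence weighted most
mu_E = [1e-60, 1e-6, 1e-9]      # measure 0 is 'pathological' with respect to E
mu_FE = [1e-61, 8e-7, 1e-10]

full = posterior_credence(weights, mu_FE, mu_E)
without_pathological = posterior_credence(weights[1:], mu_FE[1:], mu_E[1:])
dominant_only = mu_FE[1] / mu_E[1]

print(full, without_pathological, dominant_only)
```

Dropping the pathological measure changes the posterior by a negligible amount, and because
measure 1 assigns the evidence a far higher value than measure 2, the posterior is already close
to what measure 1 alone would give.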


20.7 ‘Which Measure does Nature Subscribe to?’ 


Cosmologists sometimes say rather mysterious things in describing what we are learning 
about the world when we engage in this process of defining different measures and trying 
to find out which ones assign high probability to our evidence. For example, Max Tegmark 
describes the enterprise as follows: 


There is some correct measure that nature subscribes to, and we need to figure out which 
one it is, just as was successfully done in the past for the measures allowing us to compute 
probabilities in statistical mechanics and quantum physics. (Tegmark, 2005, p. 2) 


This remark suggests the following picture. In addition to facts of the familiar sort studied 
by physics (facts about the disposition of fields in spacetime, the wave function, and so on), 
there are facts about which measure nature subscribes to. We can investigate such facts 
empirically, since our evidence that we have a certain property F counts in favour of the 
hypothesis that nature subscribes to a measure $\mu$ for which $\mu(F)$ is high. More generally:
for any measure $\mu$, when we conditionalise any reasonable prior credence function C on
the proposition that nature subscribes to $\mu$, the result (if well defined) is just given by $\mu$
itself:

DEFER TO NATURE   $C(\text{I am } F \mid \text{Nature subscribes to } \mu) = C(\mu(F))$


Given this, all that remains to be done to pin down a particular reasonable prior credence 
function C is to specify C(Nature subscribes to $\mu$) for every $\mu$. Presumably this should be
higher for simpler $\mu$.
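
One way to spell out what this picture would imply, offered as a sketch rather than as anything
stated by its proponents: suppose the propositions ‘Nature subscribes to $\mu$’ are mutually
exclusive and jointly exhaustive across a countable family of candidate measures (an assumption
made here only for illustration). Then the law of total probability, combined with DEFER TO
NATURE, gives

```latex
C(\text{I am } F)
  = \sum_{\mu} C(\text{Nature subscribes to } \mu)\,
               C(\text{I am } F \mid \text{Nature subscribes to } \mu)
  = \sum_{\mu} C(\text{Nature subscribes to } \mu)\, C(\mu(F)).
```

On this reading the prior credences C(Nature subscribes to $\mu$) act as weights on the candidate
measures, which is why it matters that they be higher for simpler ones.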

Although Tegmark’s remark is unusually explicit, the picture is not unique to him. It is 
also suggested by a certain way of using the word ‘theory’ that some cosmologists favour 
in this context, on which a theory is something that ‘builds in’ or ‘comes with’ a particular
measure or self-locating probability function. As Linde (2007, p. 32) puts it: ‘the proba- 
bility measure becomes a part of the theory, and we test both the theory and the measure 
by comparing them with observations’. By itself this is a harmless terminological choice; 
but it becomes consequential when it is combined with the natural assumption that theories 
are propositions, things capable of being true or false.²⁵ For clearly it makes no sense to
say that a measure — which is just a function from properties to real-valued random vari- 
ables obeying certain axioms — is true or false. The question has to be whether the measure 
enjoys some special status that plays something like the epistemological role characterised 
by DEFER TO NATURE. And the choice becomes a serious metaphysical commitment if 
it is combined with the further assumption that the truth or falsity of the relevant theo- 
ries remains a qualitative question, so that the relevant distinguished status is not merely a 
distinguished relation that the relevant measure stands in to us.²⁶

We find these putative facts about what measure nature subscribes to obscure and prob- 
lematic. We see no good reason to posit them at all, rather than taking seriously a more 
economical picture of reality as fully characterised by facts of a more familiar physical 
kind (such as facts about fields in spacetime). 

You may have noticed that DEFER TO NATURE is structurally parallel to (PP), the stan- 
dard ‘Principal Principle’ that specifies how reasonable prior credence functions behave 
conditional on a hypothesis about what probability function enjoys a different special
status, that of being the true objective chance function. This might suggest that those 
who do not regard facts about objective chance as objectionably spooky should have no 
metaphysical objection to facts about what measure nature subscribes to. But there is at 
least the following disanalogy between the two cases. One widely held view about objec- 
tive chance is Humean reductivism, which in its best-developed form (Elga, 2004; Lewis, 
1994), identifies the proposition that a probability function P is the objective chance func- 
tion with the proposition that P optimally balances the desiderata of simplicity and ‘fit’ 
(assigning high probability to truths, especially simple truths). It is reasonable to hope 


25 We do not attribute this assumption to Linde. 

26 For example, Page (this volume) proposes that each ‘complete theory for a universe’ should ‘give normalized
probabilities for the different possible observations $O_j$ that it predicts, so that for each $T_i$, the sum of
$P(O_j \mid T_i)$ over all $O_j$ is unity’, while also defining a ‘complete theory for a universe’ as one that
‘completely describes or specifies all properties of that universe’, in a context where it is clear that the
‘properties’ in question are qualitative properties.

that by finding the right weighting of these factors, we could define a property F′, necessarily
instantiated by at most one probability function, such that P(P is F′) will be close
to 1 for any P simple enough to deserve substantial weight in a reasonable prior credence
function. In that case, C(· | P is F′) will be well approximated by P, so that (PP)
will be approximately true (see Lewis, 1994, §10). We could certainly adopt a similar, 
reductivist Humean account of ‘being the measure to which nature subscribes’. But on this
understanding, the talk of ‘nature’ is deeply misleading, since the proposition that nature
subscribes to $\mu$ will be a self-locating proposition rather than a qualitative one. If there
is a simple measure $\mu$ that assigns a much higher probability to the properties that truly
characterise me than any other simple measure, then I will speak truly when I say ‘Nature
subscribes to $\mu$’. But someone else (some Boltzmann Brain, say), whose qualitative properties
are atypical according to $\mu$ but typical according to some other simple measure
$\mu^*$, will say something false in uttering the same sentence. If we want to conceive of
facts about what measure nature subscribes to as qualitative, Humean reductivism is not an 
option. 

Our metaphysical objection to positing irreducible facts about what measure nature 
subscribes to will, of course, not weigh so heavily with those who already endorse an 
anti-Humean realist account of objective chance. But even they should want some argu- 
ment for positing such facts. In the case of facts about objective chance, one can point to 
the pervasive role the concept seems to play in our ordinary thought and in the sciences. 
By contrast, the concept ‘measure to which nature subscribes’ seems like a recent innova- 
tion; so far, no-one seems to have given anything that looks like an argument for positing 
a domain of facts answering to this concept. 

One argument that might be given for positing such facts is that there is no way to under- 
stand the rationality and cognitive significance of the cosmologists’ enterprise of looking 
for simple cosmological models and simple measures on those models which assign rea- 
sonably high probability to our evidence — without making such a posit. But one of the 
morals of our discussion in the previous sections is that this argument is unsuccessful. 
Take a reasonable prior credence function as conceived by someone who posits facts about 
what nature subscribes to, and simply delete all the probabilities assigned to propositions 
about that peculiar subject matter, leaving the probabilities of ordinary qualitative and self- 
locating propositions unchanged. So long as the original probability function gave simpler 
measures a higher probability of being subscribed to, the output of this procedure will 
fit our compromising picture of reasonable priors: it will be a weighted average of self- 
locating probability functions in which simple ones receive higher weight. Thus, insofar as 
we are interested in how our evidence bears on these ordinary questions, there is no need to 
take seriously the idea that we are uncovering hidden facts about what nature subscribes to. 
We can treat this as nothing more than a colourful manner of speaking: for example, we can 
say ‘We have learnt that nature subscribes to $\mu$’ when what we really mean is something
like ‘We have received evidence such that the result of conditionalising our priors on it is
approximately equal to the result of conditionalising a self-locating probability function
corresponding to $\mu$ on it’.


20.8 Conclusion


To sum up: for completely ordinary reasons, we should realise that the propositions we 
believe and that constitute our evidence are not always qualitative, and we should imple- 
ment the familiar Bayesian idea that a reasonable credence function is derived from 
reasonable priors by conditionalisation on evidence in a way that treats self-locating propo- 
sitions and qualitative propositions as being on a par. In such a framework there is nothing 
mysterious or surprising about the idea that our total self-locating evidence I am E might
support one qualitative proposition over another, even when both propositions entail that 
someone is E, or entail that there are infinitely many people. The question when this 
happens reduces to the question what priors are reasonable. We should not, in general, 
expect to be able to establish our claims about reasonable priors by deducing them from 
precisely stated, uncontroversial principles, or regard it as a deep problem for some hypoth- 
esis about the world if we cannot find principles of this sort which completely pin down 
the result of conditionalising any reasonable prior credence function on that hypothesis. 
(If this is what it means for a hypothesis to make ‘firm predictions’, firm predictions are 
overrated.) Once we give up on the misguided hope for a knockdown demonstration that 
reasonable priors have to work in a certain way, we can get a long way with the famil- 
iar, vague idea that reasonable priors should be influenced by considerations of simplicity. 
In particular, the idea that reasonable priors are weighted averages of simple probabil- 
ity functions, in which simpler probability functions are weighted more heavily, yields 
prescriptions for reasoning about infinite-population hypotheses that are both intuitively 
plausible, and a good fit for the methodology that cosmologists have actually adopted in 
practice.


On Probability and Cosmology: Inference Beyond Data?


MARTIN SAHLÉN


21.1 Cosmological Model Inference with Finite Data 


In physical cosmology we are faced with an empirical context of gradually diminishing 
returns from new observations. This is true in a fundamental sense, since the amount of 
information we can expect to collect through astronomical observations is finite, owing to 
the fact that we occupy a particular vantage point in the history and spatial extent of the 
Universe. Arguably, we may approach the observational limit in the foreseeable future, at 
least in relation to some scientific hypotheses (Ellis, 2014). There is no guarantee that the 
amount and types of information we are able to collect will be sufficient to statistically test 
all reasonable hypotheses that may be posed. There is under-determination both in princi- 
ple and in practice (Butterfield, 2014; Ellis, 2014; Zinkernagel, 2011). These circumstances 
are not new; indeed, cosmology has had to contend with this problem throughout history.
For example, Whitrow (1949) relates the same concerns, and points back to remarks by
Blaise Pascal in the seventeenth century: ‘But if our view be arrested there let our imagination
pass beyond; ... We may enlarge our conceptions beyond all imaginable space; we
only produce atoms in comparison with the reality of things’. Already with Thales, episte- 
mological principles of uniformity and consistency have been used to structure the locally 
imaginable into something considered globally plausible. The primary example in contem- 
porary cosmology is the Cosmological Principle of large-scale isotropy and homogeneity. 
In the following, the aim will be to apply such epistemological principles to the procedure 
of cosmological model inference itself. 

The state of affairs described above naturally leads to a view of model inference as infer- 
ence to the best explanation/model (e.g. Lipton, 2004; Maher, 1993), since some degree of 
explanatory ambiguity appears unavoidable in principle. This is consistent with a Bayesian 
interpretation of probability which includes a priori assumptions explicitly. As in science 
generally, inference in cosmology is based on statistical testing of models in light of empir- 
ical data. A large body of literature has built up in recent years discussing various aspects 
of these methods, with Bayesian statistics becoming a standard framework (Hobson, 2010; 
Jaynes, 2003; von Toussaint, 2011). The necessary foundations of Bayesian inference will 
be presented in the next section.

Turning to the current observational and theoretical status of cosmology, a fundamental
understanding of the dark energy phenomenon is largely lacking. Hence we would like to 
collect more data. Yet all data collected so far point consistently to the simplest model of 
a cosmological constant, which is not well understood in any fundamental sense. Many of 
the other theoretical models of dark energy are also such that they may be observationally 
indistinguishable from a cosmological constant (Efstathiou, 2008). Another important area 
is empirical tests of the inflationary paradigm, the leading explanation of the initial condi- 
tions for structure in the Universe (e.g. Smeenk, 2014). Testing such models necessitates, 
in principle, the specification or derivation of an a priori probability of inflation occurring 
(with particular duration and other relevant properties). The measure problem is the ques- 
tion of how this specification is to be made, and will be the departure point and central 
concern in the following sections. 

We will argue that the measure problem, and hence model inference, is ill defined due to 
ambiguity in the concepts of probability, global properties, and explanation, in the situation 
where additional empirical observations cannot add any significant new information about 
some relevant global property of interest. We then turn to the question of how model inference
can be made conceptually well defined in this context, by extending the concept of
probability to general valuations (under a few basic restrictions) on partially ordered sets 
known as lattices. On this basis, an extended axiological Bayesianism for model inference 
is then outlined. The main purpose here is to propose a well-motivated, systematic formal- 
isation of the various model assessments routinely, but informally, performed by practising 
scientists. 


21.2 Bayesian Inference 


Inference can be performed on different levels. An important distinction is that between 
parameter and model inference: the first assumes that a particular model is true and derives 
the most likely model parameter values on the basis of observations, whereas the latter 
compares the relative probability that different models are true on the basis of observations. 
The two inferential procedures can be regarded as corresponding epistemically to descrip- 
tion (parameter inference) and explanation (model inference), respectively. This chapter 
will focus on model inference, which becomes particularly troublesome in the global cos- 
mological context. We present both cases below for completeness. For more on Bayesian 
inference, see e.g. Jaynes (2003). 


21.2.1 Parameter Inference


Bayesian parameter inference is performed by computing the posterior probability

$$p(\theta \mid D; M) = \frac{\mathcal{L}(D \mid \theta; M)\,\Pi(\theta; M)}{P(D; M)}, \tag{21.1}$$

where D is some collection of data, M the model under consideration and $\theta$ the model
parameters. The likelihood of the data is given by $\mathcal{L}(D \mid \theta; M)$ and the prior
probability distribution is $\Pi(\theta; M)$. The normalisation constant P(D; M) is irrelevant for
parameter inference, but central to model inference, and will be discussed next. The expression
above is known as Bayes’ theorem.
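
As a concrete illustration of Eq. (21.1), the sketch below evaluates a posterior on a grid for
a single parameter. Everything specific in it is an assumption made for the example: the Gaussian
likelihood with known scatter `sigma`, the flat prior on [0, 10], the four data values and the
function names are hypothetical and are not drawn from any cosmological analysis.

```python
import math

def likelihood(data, theta, sigma):
    """L(D | theta; M): independent Gaussian measurements of theta with known sigma."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return math.prod(norm * math.exp(-0.5 * ((x - theta) / sigma) ** 2) for x in data)

def prior(theta, lo, hi):
    """Pi(theta; M): flat prior density on [lo, hi]."""
    return 1.0 / (hi - lo) if lo <= theta <= hi else 0.0

def grid_posterior(data, sigma, lo, hi, n=2001):
    """Evaluate the posterior density on an evenly spaced grid of n points."""
    step = (hi - lo) / (n - 1)
    grid = [lo + i * step for i in range(n)]
    unnormalised = [likelihood(data, t, sigma) * prior(t, lo, hi) for t in grid]
    evidence = sum(unnormalised) * step  # Riemann-sum estimate of P(D; M)
    return grid, [u / evidence for u in unnormalised], step

data = [4.8, 5.3, 5.1, 4.9]  # hypothetical measurements
grid, posterior, step = grid_posterior(data, sigma=0.5, lo=0.0, hi=10.0)
theta_best = grid[posterior.index(max(posterior))]
print(f"most probable theta on the grid: {theta_best:.3f}")
```

With a flat prior the most probable value sits at the sample mean of the data, and the
normalisation by P(D; M) changes only the overall scale of the curve, which is why it can be
ignored for parameter inference.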


21.2.2 Model Inference 


The Bayesian evidence for a model M given data D can be written 
$$P(D; M) = \int \mathcal{L}(D \mid \theta; M)\,\Pi(\theta; M)\,\mathrm{d}\theta, \tag{21.2}$$


where symbols are defined as above. The Bayesian evidence is also called the marginalised 
likelihood, reflecting the fact that it measures the average likelihood across the prior distri- 
bution, and is thus a measure of overall model goodness in light of data and pre-knowledge. 
It is used in inference to compare models, with a higher evidence indicating a better model. 
Conventional reference scales (e.g. the Jeffreys scale) exist to suggest when a difference in 
evidence is large enough to prefer one model over another (Hobson, 2010). 
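
The sensitivity of the evidence to the prior can be made concrete with a small numerical sketch.
The setup is hypothetical throughout: two ‘models’ share the same Gaussian likelihood and differ
only in the width of a flat prior on a single parameter, and the data values, prior ranges and
function names are invented for the example. The integral in Eq. (21.2) is estimated with a
simple midpoint rule.

```python
import math

def likelihood(data, theta, sigma):
    """L(D | theta; M): independent Gaussian measurements of theta with known sigma."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return math.prod(norm * math.exp(-0.5 * ((x - theta) / sigma) ** 2) for x in data)

def evidence(data, sigma, lo, hi, n=20000):
    """Midpoint-rule estimate of P(D; M) for a flat prior on [lo, hi]."""
    step = (hi - lo) / n
    prior_density = 1.0 / (hi - lo)
    total = sum(likelihood(data, lo + (i + 0.5) * step, sigma) for i in range(n))
    return total * prior_density * step

data = [4.8, 5.3, 5.1, 4.9]                      # hypothetical measurements
z_narrow = evidence(data, 0.5, lo=4.0, hi=6.0)   # model with a tight prior range
z_wide = evidence(data, 0.5, lo=0.0, hi=100.0)   # model with a very broad prior range
bayes_factor = z_narrow / z_wide

print(f"evidence ratio (narrow / wide): {bayes_factor:.2f}")
print(f"log evidence ratio: {math.log(bayes_factor):.2f}")
```

Both priors comfortably contain the region favoured by the data, so the likelihood integral is
almost the same in each case and the evidence ratio comes out close to the ratio of the prior
widths. The broad prior is penalised for spreading its probability over values the data rule out.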

Looking at Eq. (21.2), the Bayesian evidence is clearly sensitive to the specification of 
the prior distribution. A prior is usually specified based on previous empirical knowledge 
from parameter estimation, or may be predicted by the theoretical model, or given by some 
aesthetic principle. This highlights two things: without empirical pre-knowledge, a prior is 
entirely based on theoretical or philosophical assumption, and a prior is also not cleanly 
separable from the model likelihood and can therefore to a degree be regarded as part of 
the model. As increasing amounts of data are collected, the influence of the initial prior is 
gradually diminished through the process of Bayesian updating, i.e. the current posterior
probability becomes the (new) prior probability for a future data analysis. Through this
process, the posterior eventually converges to a distribution that depends essentially only on
the total number and accuracy of the measurements. Increasingly numerous and precise
measurements make the initial prior insignificant for the posterior. When data are extremely 
limited relative to the quantity/model of interest, this process stops short and the initial prior 
can then play a significant role in the evidence calculation. This will be of importance in 
the discussion that follows. 
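
The way in which data wash out the initial prior, and the way this fails when data are scarce,
can be illustrated with a conjugate toy model. The sketch below is not a cosmological
calculation: it assumes a Beta prior on a success probability updated on binomial data, with two
hypothetical priors and idealised data in which 30 per cent of trials succeed.

```python
def posterior_mean(prior_a, prior_b, successes, trials):
    """Mean of the Beta posterior after a Beta(prior_a, prior_b) prior sees binomial data."""
    return (prior_a + successes) / (prior_a + prior_b + trials)

def update(prior_a, prior_b, successes, trials):
    """Conjugate Bayesian update: the returned Beta parameters serve as the next prior."""
    return prior_a + successes, prior_b + (trials - successes)

flat_prior = (1.0, 1.0)      # hypothetical 'uninformative' prior
skewed_prior = (20.0, 2.0)   # hypothetical prior strongly favouring large values

gaps = []
for trials in (0, 10, 100, 10000):
    successes = round(0.3 * trials)  # idealised data with a success fraction of 0.3
    gap = abs(posterior_mean(*flat_prior, successes, trials)
              - posterior_mean(*skewed_prior, successes, trials))
    gaps.append(gap)
    print(f"trials={trials:6d}  gap between posterior means: {gap:.4f}")

# Updating in two batches gives the same posterior as updating once on all the data.
two_step = update(*update(*flat_prior, 3, 10), 27, 90)
one_step = update(*flat_prior, 30, 100)
```

With ten trials the two priors still lead to very different posterior means; with ten thousand
the difference is of order one part in a thousand. A situation in which no further data can be
collected corresponds to being stuck in the first few rows of this output.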


21.3 Global Model Inference in Cosmology 


Cosmology, by its nature, seeks to describe and explain the large-scale and global proper- 
ties of the Universe. There is also, by the nature of the field, a problem of finite data and 
underdetermination that becomes particularly poignant for measuring and explaining some 
global properties of the Universe. This will typically be associated with features on physi- 
cal scales corresponding to the size of the observable Universe or larger, or features in the 
very early Universe. On the one hand, there is an epistemological question of knowledge 
based on one observation (i.e. the one realisation of a universe we can observe): how accu- 
rate/representative is our measurement? On the other hand, there is an ontological question 
of whether a property is truly global: if not, how might it co-depend on other properties, 
with possible implications for the evaluation of probabilities and inference? We shall therefore
distinguish epistemically and ontically global properties in the following. In general,
a global property will be defined here as some feature of the Universe which remains
constant across the relevant domain (e.g. observable Universe, totality of existence). 

A conventional approach to global properties is to treat separate regions of the Universe 
as effectively separate universes, such that sampling across regions in the Universe cor- 
responds to sampling across different realisations of a universe. While this approach is 
useful for understanding the statistics of our Universe on scales smaller than the observ- 
able Universe, when approaching this scale the uncertainty becomes increasingly large and 
eventually dominates. This uncertainty is commonly called cosmic variance (see e.g. Lyth 
and Liddle, 2009). We will explicitly only be concerned with the case when this cos- 
mic variance cannot be further reduced by additional empirical observations to any 
significant degree, for some global property of interest. 
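
As a concrete illustration, here is a minimal sketch of the standard cosmic-variance estimate for an angular power spectrum, assuming Gaussian fluctuations and full-sky coverage. Each multipole ell then offers only 2*ell + 1 independent modes, so the fractional uncertainty on C_ell is sqrt(2 / (2*ell + 1)). The function name is hypothetical.

```python
import math


def cosmic_variance_fraction(ell):
    """Fractional cosmic-variance uncertainty on C_ell,
    assuming Gaussian fluctuations and full-sky coverage."""
    if ell < 1:
        raise ValueError("multipole must be a positive integer")
    return math.sqrt(2.0 / (2 * ell + 1))


for ell in (2, 10, 100, 1000):
    print(ell, round(cosmic_variance_fraction(ell), 3))
```

The largest angular scales (lowest ell) carry an uncertainty of tens of per cent that no further observation of the one available sky can reduce, which is the regime considered here.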

A case in point of particular contemporary relevance concerns the initial conditions of 
the Universe — what is the statistical distribution of such initial conditions? This is often 
described as ‘the probability of the Universe’, or ‘the probability of inflation’ since the 
inflationary paradigm is the leading explanation for producing a large, geometrically flat 
universe and its initial density fluctuations. More formally, it is known as the measure 
problem: what is the probability measure on the space of possible universes (known as the 
multiverse)? The measure problem is important, because parameter inference might 
non-negligibly depend on this measure, and model inference should non-negligibly depend on 
this measure. Meaningfully performing inference at this level of global properties therefore 
depends on finding some resolution for how to approach the measure problem. In recent 
years, this has led to intense debate on the scientific status of inflation theory, string the- 
ory, and other multiverse proposals (Carr, 2007; Dawid, 2013, 2015; Ellis and Silk, 2014; 
Kragh, 2014; Smeenk, 2014; Steinhardt, 2011). 

This is not the place for addressing the range of approaches to this problem in the lit- 
erature. Proposals commonly rely on some relative spatial volume, aesthetic/theoretical 
principle (e.g. Jeffreys prior, maximum entropy), or dynamical principle (e.g. energy con- 
servation, Liouville measure). The reader is referred to Carr (2007); Smeenk (2014) and 
references therein for more details. 


21.4 The Measure Problem: A Critical Analysis 
21.4.1 Preliminaries 


It is helpful to recognise that the measure problem is a sub-problem, arising in a particular 
context, related to the broader question of how to perform model inference in relation to 
global properties of the Universe. It arises as a problem from the desire to provide expla- 
nation for some certain global properties of the Universe, and so depends on a view of 
what requires explanation and what provides suitable explanation. In pursuing statistical 
explanation, the problem naturally presents itself through the application of conventional 
Bayesian statistical inference as we have seen above, and particularly in the calculation of 
Bayesian evidence, where the assignment of a prior probability distribution for parameter 
values is essential. The model and/or prior will also explicitly or implicitly describe how 
different global properties co-depend, and more generally prescribe some particular struc- 
ture for the unobserved ensemble of universes (hence, the multiverse). This ensemble may 
or may not correspond to a physically real such structure. 

In addressing the measure problem, one might therefore explore the implications of 
varying the conditions, assumptions, and approaches described above. To what extent is the 
measure problem a product thereof? We will consider this question in the following. The 
analysis echoes issues raised in e.g. Ellis’ and Aguirre’s contributions in Carr (2007), Ellis 
(2014) and Smeenk (2014), while providing a new context and synthesis. In the following, 
Kolmogorov probability will be contrasted with Bayesian probability for illustration and 
motivation. We note that other foundations and definitions of probability, which we do not 
consider, also exist (e.g. de Finetti’s approach) — see Jaynes (2003). 


21.4.2 Analysis 


Structure of Global Properties 


Statistical analysis for global properties in cosmology typically relies on certain unspoken 
assumptions. For example, it is commonly assumed that the constants of Nature are statis- 
tically independent, or that a multiverse can be meaningfully described by slight variations 
of the laws of physics as we know them, for example as in Tegmark et al. (2006). For 
many practical purposes, these assumptions are reasonable or irrelevant. However, in some 
cosmological contexts, especially in relation to global properties and questions of typical- 
ity/fine-tuning, such assumptions can impact on the conclusions drawn from observations. 
For example, it has been argued that fine-tuning arguments rely on untestable theoretical 
assumptions about the structure of the space of possible universes (Ellis, 2014). 

A distinction was made in the preceding section between epistemically global and onti- 
cally global properties of the Universe. A central point is that it is impossible to make 
this distinction purely observationally: the set of ontically global properties will intersect 
the set of epistemically global properties, but which properties belong to this intersection 
set cannot be determined observationally. Hence, it is possible that some ontically global 
properties remain unknown, and that some epistemically global properties are not onti- 
cally global. This implies that in general, global properties will be subject to an uncertainty 
associated with these sources of epistemic indeterminacy. 

In consequence, when seeking to determine and explain some certain global properties 
through analysis of observational data, the possibility that the values of these global prop- 
erties could depend on some other global properties — known or unknown — cannot be 
excluded empirically. One example of this possibility concerns the constants of Nature, 
whose values may be interdependent (as also predicted by some theories). Another exam- 
ple is the distinction between physical (global) law and initial conditions of the Universe: 
in what sense are these concepts different? They are both epistemically global proper- 
ties, and from an epistemological point of view clearly interdependent to some extent (e.g. 
‘observable history = initial conditions + evolution laws’). Yet their epistemic status is often 
considered to be categorically different, on the basis of extrapolated theoretical structure. 
These issues are discussed further in e.g. Ellis (2014) and Smeenk (2014). 

To explore these ideas further, let us consider a cosmological model described by some 
global parameters $\theta_p$ (e.g. constants of Nature or initial conditions). This model also 
contains a specification of physical laws, which normally are considered fixed in mathematical 
form. For the sake of argument, let us now assume that deviations from these laws can be 
meaningfully described by some additional set of parameters $\delta\theta_l$. Such a $\delta\theta_l$ will give rise 
to a shift $\delta\mathcal{L}(\theta_p, \delta\theta_l)$ in the data likelihood for our assumed exhaustive observations, relative 
to the same likelihood assuming standard physical laws. 

The function $\delta\mathcal{L}$ will in principle depend on the relationship between $\theta_p$ and $\delta\theta_l$ 
in some more general model picture (we have no reason a priori to exclude such 
a more general picture). The shift $\delta\theta_l$ should also affect the prior $\Pi(\theta_p)$. While this 
may not be explicitly stated, a parameter prior for $\theta_p$ is always specified under the 
assumption of certain physical laws. For the case of global parameters, the distinction 
between parameters and laws becomes blurred, as they are both global properties: they 
may in fact be co-dependent in some more general model. However, the correlation 
matrix (or other dependence) between $\theta_p$ and $\delta\theta_l$ cannot be independently determined 
from observations, since only one observable realisation of parameter values is available 
(i.e. our one observable Universe). Hence, the shift $\delta\theta_l$ should in general induce a 
shift $\Pi \to \Pi + \delta\Pi$ in the assumed prior, due to the empirically allowed and theoretically 
plausible dependencies between $\theta_p$ and $\delta\theta_l$. On this basis, there will in general 
be some function $\delta\Pi$ that should be included for probabilistic completeness. But this 
function is essentially unconstrained since it cannot be independently verified or falsified. 
Hence, we are in principle always free to renormalise the Bayesian evidence by 
an arbitrary (non-negative) amount without violating any empirical constraints. Model 
inference in the conventional sense therefore becomes ill-defined or meaningless in this 
context. 
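
The effect on the evidence can be made explicit in a short sketch. It assumes the conventional evidence integral $Z = \int \mathcal{L}\,\Pi\,\mathrm{d}\theta_p$, with $\theta_p$ the global parameters, $\delta\theta_l$ the law-deviation parameters, $\mathcal{L}$ the likelihood and $\Pi$ the prior. The symbols $Z$ and $\delta Z$ are introduced here for illustration only.

```latex
\begin{align}
Z + \delta Z
  &= \int \left[\mathcal{L}(\theta_p) + \delta\mathcal{L}(\theta_p, \delta\theta_l)\right]
          \left[\Pi(\theta_p) + \delta\Pi\right] \mathrm{d}\theta_p , \\
\delta Z
  &= \int \delta\mathcal{L}\,\Pi \,\mathrm{d}\theta_p
   + \int \mathcal{L}\,\delta\Pi \,\mathrm{d}\theta_p
   + \int \delta\mathcal{L}\,\delta\Pi \,\mathrm{d}\theta_p .
\end{align}
```

The last two terms contain $\delta\Pi$, which no observation fixes, so under these assumptions $\delta Z$ inherits the same freedom.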

The problem here is that while we know that there in general should be co-depend- 
encies/correlations between laws/parameters, we are unable to account for them. This 
means that Kolmogorov’s third axiom (the measure evaluated on the ‘whole set’ equals the 
sum of the measures on the disjoint subsets) is generically violated due to unaccounted-for 
correlations between known and unknown global properties (see Jaynes, 2003; 
Kolmogorov, 1933, for details on Kolmogorov’s probability theory). The axiom could be 
regarded as unphysical in this context. This may also lead to problems satisfying Kol- 
mogorov’s first (non-negative probabilities) and second axiom (unitarity) for the prior. 
While Bayesian probability based on Cox’s theorem (Cox, 1946, 1961) does not require 
Kolmogorov’s third axiom (discussed further in the following subsection), potential prob- 
lems equally arise in relation to negative probabilities and non-unitarity. Therefore, we find 
that probability in this context is better thought of as quasiprobability (occurring also e.g. 
in the phase space formulation of quantum mechanics, Wigner, 1932). 


Foundations for Inference 


Statistical analysis of empirical measurements is the paradigm within which inference 
usually takes place in science, including cosmology. It is therefore important for us to con- 
sider the foundations of probability, as applicable to this process. A central distinction as 
regards probability is that between physical probability and inductive probability. The first 
is some putative ontic probability, the second corresponds to the epistemological evalua- 
tion of empirical data. Albrecht and Phillips (2014) claim that there is no physically verified 
classical theory of probability, and that physical probability rather appears to be a funda- 
mentally quantum phenomenon. They argue that this undermines the validity of certain 
questions/statements based on non-quantum probability and makes them ill-defined, and 
that a ‘quantum-consistent’ approach appears able to provide a resolution of the measure 
problem. The precise relationship between physical and inductive probability is a long- 
standing topic of debate, which we will not go into great detail on here (for a review, see 
Jaynes, 2003). The essential point for our discussion is the possible distinction between 
the two, and the idea that inductive probability can be calibrated to physical probability 
through repeated observations, e.g. in a frequency-of-outcome specification, or a Bayesian 
posterior updating process. In that way, credence can be calibrated to empirical evidence. 

This procedure fails when considering the Universe as a whole, for which only one 
observable realisation exists. A conventional inductive approach to probability therefore 
becomes inadequate (unless one assumes that the observable Universe defines the total- 
ity of existence, but this is a rather absurd and arbitrary notion which inflationary theory 
also contradicts). This can also be regarded as a failure to satisfy the second axiom in 
Kolmogorov’s definition of probability: that there are no elementary events outside the 
sample space (Jaynes, 2003; Kolmogorov, 1933). Without a well-defined empirical cali- 
bration of sample space and evidence, one is at risk of circular reasoning where inductive 
probability is simply calibrated to itself (and hence, the a priori assumptions made in 
the analysis). Related situations in cosmology have been discussed in the context of the 
inductive disjunctive fallacy by Norton (2010). 

Bayesian statistics has become the standard approach in cosmology, due to the limitations 
of ‘frequentist’ methods when data are scarce in relation to the tested hypotheses 
(the whole-Universe and Multiverse cases being the extreme end of the spectrum). The 
formalism of Bayesian statistics can be motivated from rather general assumptions. For 
example, Cox’s theorem shows that Bayesian statistics is the unique generalisation of 
Boolean logic in the presence of uncertainty (Cox, 1946, 1961). This provides a logi- 
cal foundation for Bayesian statistics. In recent years, it has further been shown that the 
Bayesian statistical formalism follows from even more general assumptions (the lattice 
probability construction, see Knuth and Skilling, 2012; Skilling, 2010), which we will 
return to in Section 21.5.1. 

There is a difference between the definition of probability based on Kolmogorov’s 
three axioms (Jaynes, 2003; Kolmogorov, 1933), and Bayesian probability based on Cox’s 
theorem (Cox, 1946, 1961; Van Horn, 2003). Both constructions are measure-theoretic 
in nature, but Kolmogorov probability places a more restrictive requirement on valid 
measures. In Bayesian probability, measures are finitely additive, whereas Kolmogorov 
measures are countably additive (a subset of finitely additive measures). This means that 
in Bayesian probability (based on Cox’s theorem), in principle the measure evaluated on 
the full sample space need not equal the sum of the measures of all disjoint subsets of the 
sample space. This can be understood to mean that integrated regions under such a measure 
do not represent probabilities of mutually exclusive states. In the preceding subsection, we 
discussed this possibility in the context of unaccounted-for correlations in the structure of 
global properties. In the Bayesian statistical set-up, this is thus not a problem in principle, 
although problematic negative probabilities and non-unitarity could also occur in this case. 

We can thus see certain benefits with a modern Bayesian statistical framework, rela- 
tive to the Kolmogorov definition of probability, even though some issues also present 
themselves. This leaves, at least, an overall indeterminacy in ‘total probability’ (through 
$\delta\Pi$). More broadly, it also remains an open question what the status of the logical 
foundations is. Is there a unique logical foundation? What is its relation to the physical 
Universe/Multiverse — is it a physical property? (cf. Putnam, 1969). 


Modes of Explanation 


In addition to the above, a fundamental question is which phenomena, or findings other- 
wise, are thought to require explanation (on the basis of current theoretical understanding), 
and what provides that explanation. In the case of the measure problem, it is often con- 
sidered that the initial conditions of the Universe appear to have been very special, i.e. in 
some sense very unlikely, and therefore need to be explained (Ellis, 2014; Smeenk, 2014). 
This proposition clearly rests on some a priori notion of probable universes, often based 
on extrapolations of known physics (e.g. fixed global laws). 

The main mode of explanation in cosmology is based on statistical evaluation of 
observational data. This is usually done using the formalism of Bayesian statistics. The 
conventional approaches for providing a solution to the measure problem are also sta- 
tistical/probabilistic in nature, and can be regarded as picking some ‘target’ probability 
regime that is to be reached for the posterior probability, for a proposed measure to be con- 
sidered explanatory (another alternative is some strong theoretical/structural motivation). 
There are broadly speaking two modes of explanation in this vein: ‘chance’ and ‘neces- 
sity’. For example, the measured values of physical constants are to be highly probable 
(fine-tuning/anthropics), average (principle of mediocrity), or perhaps necessarily realised 
at least ‘somewhere’ (string-theory landscape). However, in view of the discussion in the 
preceding subsections, it appears impossible to independently establish such probabilities, 
and hence the notions of and distinctions between chance and necessity become blurred. 
Statistical explanation therefore suffers from the same problems as detailed in the pre- 
ceding subsections, with the risk for circular confirmatory reasoning that in reality is not 
actually explanatory. This critique echoes common objections to the epistemic theories 
of justification called foundationalism and coherentism. Epistemic justification based on 
coherentism can provide support for almost anything through circularity, and foundation- 
alism can become arbitrary through the assertion of otherwise unjustified foundational 
beliefs. Neither of these two epistemological approaches appears able to provide satisfactory 
epistemic justification in response to the ambiguity in the effective Bayesian prior that we 
are discussing. 

In terms of model structure, the typical form of explanation takes the shape of evolution- 
ary, universal laws combined with some set of initial conditions at the ‘earliest’ time. Some 
additional aesthetic/structural criteria such as simplicity, symmetry, and conserved quanti- 
ties may also be implicitly invoked. These are usually introduced as part of the theoretical 
modelling, rather than as intrinsically integrated with the inferential procedure. Therefore, 
possible co-dependencies with other explanatory criteria (which may be thought of as a 
type of global properties!) are usually not considered or explored. 

In conclusion, statistical explanation is ill defined in the context of the measure problem 
in Bayesian statistics, and the relation to other possible explanatory principles is typically 
neglected. How to interpret a Bayesian evidence value, or differences between such values 
for different models, is therefore not clear in this context. 


21.4.3 A Synthesis Proposal 


We thus find that the measure problem is ill-defined, in principle, within conventional 
Bayesian statistical inference. This is due to a compound ambiguity 
in the definition of probability, prior specification, and evidence interpretation, based on 
the observations above that the concepts of 


• laws and global parameters/initial conditions 
• probability 
• explanation 


are ambiguous when considering a global, whole-Universe (or Multiverse) context. Hence, 
measure problem solution proposals in the Bayesian statistical context ultimately are 
subjective statements about how ‘strange’ or ‘reasonable’ one considers the observable 
Universe to be from some particular point of view. This suggests that conventional 
Bayesian model inference concerning global properties in the circumstances we consider 
is non-explanatory, and at best only self-confirmatory. Additional information is needed to 
provide meaningful epistemic justification. 

The ambiguities listed above can be regarded, in the language of conventional Bayesian 
inference, as each introducing some effective renormalisation in the expression for 
Bayesian evidence. Can these ambiguities be accommodated in such a way that infer- 
ence becomes conceptually well defined? There appear to be three possible, consistent, 
responses to this: 


1. Declare model inference meaningless in this context and in the spirit of Wittgenstein 
stay silent about it. 

2. Introduce additional philosophical/explanatory principles that effectively restrict the 
prior (e.g. an anthropic principle, a dynamical or aesthetic principle on model space). 

3. Introduce novel empirical/epistemic domains (e.g. mathematical-structure space of 
accepted scientific theory, possibility space of accepted scientific theory). 

This author argues that the natural and consistent approach involves a combination of 
responses 2 and 3. To pursue this, we need to explicitly address the ambiguous concepts and the nature 
of the ambiguity. The lattice conception of probability (reviewed in Section 21.5.1), includ- 
ing the measure-theoretic Bayesian probability as a special case, provides a way to do this. 
It allows the generalisation of Bayesian probability and evidence to include more general 
valuation functions which can encompass the ambiguity in model structure, probability 
and explanation in a consistent way. It also demonstrates that combining valuation func- 
tions is done uniquely by multiplication, suggesting that the identified ambiguities can be 
decomposed in such a way. 

In conclusion, the measure problem has an irreducible axiological component. While 
there may be some hope of falsifying certain extreme measures, it should however be clear 
that any successful resolution would need to address this axiological component. The pro- 
posed construction constitutes a natural extension of the Bayesian statistical framework, 
and can be motivated and naturally implemented starting from the notions of partially 
ordered sets and measure theory. It aims to explicitly include valuations which are con- 
ventionally left implicit, to provide a conceptually and theoretically consistent framework. 
The workings of such an axiological Bayesianism will now be presented. 


21.5 Axiological Bayesianism 
21.5.1 Lattice Probability 


Kevin H. Knuth and John Skilling have developed a novel approach to the foundations 
of Bayesian probability, based on partially ordered sets and associated valuations (Knuth 
and Skilling, 2012). It generalises Kolmogorov’s and Cox’s earlier work. An overview is 
given in Skilling (2010), but the main features will be outlined here. The construction starts 
off from a general set of possibilities, for example a set of different models or parameter 
values, but where our ultimate purpose is to quantitatively constrain to a subset of inferred 
preferable possibilities. The set is built up from a ‘null’ element, a set of basis elements 
(e.g. {Model A, Model B}), and the set of all combinations of logical disjunctions (‘OR’) 
between the basis elements (e.g. {Model A-OR-Model B}). 

On this set, partial ordering is defined, denoted here by ‘<’. For elements x and y, we 
have that x < y means that y includes x. The ordering is required to be transitive, i.e. 


\[ x < y \ \text{and}\ y < z \implies x < z. \tag{21.3} \]


The concept of least upper bound is introduced separately. The least upper bound to x and 
y, if it exists, is the least element at or including both x and y. We denote it by $x \vee y$. The 
greatest lower bound of x and y is defined analogously, and denoted $x \wedge y$. 

A lattice is a partially ordered set with a well-defined least upper bound, reflecting the 
idea that the ordering induces a structure on the set. It also obeys (among other conventional 
axioms) associativity, $(x \vee y) \vee z = x \vee (y \vee z)$. This property is central to the probability 
construction based on valuations that now follows. 

On the lattice, a function prescribing a quantitative valuation to each lattice element is 
then introduced. The purpose of this valuation is to rank elements. Requiring that such a 
valuation respects the ordering and the lattice structure, it can be shown that any valuation 
m must satisfy 


\[ m(x \vee y) = m(x) + m(y), \tag{21.4} \]


without loss of generality. This is essentially what defines a mathematical measure. From 
this it also follows that the valuation of general lattice elements (constructed via use of ‘$\vee$’) 
can be built up by addition from valuation prescriptions for the basis elements. 
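
The additive construction can be sketched in a few lines. This is a minimal illustration, assuming lattice elements are represented as sets of basis elements with the join given by set union. The model names and numerical valuations are hypothetical, and the sum rule is checked only for disjoint elements, which is the case it addresses.

```python
from itertools import combinations

# Hypothetical basis elements and basis valuations (illustrative numbers only).
basis_valuation = {"Model A": 2.0, "Model B": 3.0, "Model C": 5.0}


def join(x, y):
    """Least upper bound of two lattice elements: the logical OR of their content."""
    return x | y


def valuation(element):
    """Valuation of a lattice element, built by addition from its basis elements."""
    return sum(basis_valuation[name] for name in element)


# Lattice elements are frozensets of basis names; the empty set is the 'null' element.
basis = [frozenset([name]) for name in basis_valuation]
lattice = [
    frozenset(names)
    for r in range(len(basis_valuation) + 1)
    for names in combinations(basis_valuation, r)
]

a, b = basis[0], basis[1]
print(valuation(join(a, b)), valuation(a) + valuation(b))
```

Specifying a valuation for each basis element therefore fixes the valuation of every element of the lattice.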

One can also consider a direct product ‘$\times$’ of lattices, which under similar assumptions 
on $m$ as above leads to the requirement 


\[ m(x \times y) = m(x)\, m(y) \tag{21.5} \]


on the valuation m. Combinations are thus always multiplicative. This will be of particular 
importance in the following. 
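
A minimal sketch of the product rule follows, assuming two independent lattices whose elements are sets of basis names. All names and numbers are hypothetical. It checks that combining valuations by multiplication keeps the sum rule intact on the product lattice.

```python
# Hypothetical valuations on the basis elements of two independent lattices.
m_model = {"Model A": 2.0, "Model B": 3.0}
m_criterion = {"fits data": 0.5, "is simple": 4.0}


def m(element, basis_valuation):
    """Additive valuation of a lattice element given as a frozenset of basis names."""
    return sum(basis_valuation[name] for name in element)


def m_product(x, y):
    """Valuation on the direct product lattice, combined by multiplication."""
    return m(x, m_model) * m(y, m_criterion)


x1, x2 = frozenset(["Model A"]), frozenset(["Model B"])
y = frozenset(["fits data"])

# The join on the first factor distributes over the product.
lhs = m_product(x1 | x2, y)
rhs = m_product(x1, y) + m_product(x2, y)
print(lhs, rhs)
```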

Turning to the question of how to define probability, it can be shown that under preser- 
vation of lattice structure, associativity and unitarity, conventional Bayesian probability 
calculus follows, with a probability p defined by 


\[ p(x \mid t) = \frac{m(x \wedge t)}{m(t)}, \tag{21.6} \]


where $t$ is some lattice context that one considers a priori. This expression generalises the 
conventional probability concept and calculus to any valuation concordant with ordering, 
lattice structure and associativity. It therefore provides a basis for a generalised inference 
procedure: a prescription for how to reason rationally also within the ambiguous context 
we consider. 
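
The ratio form of the probability can be sketched as follows, assuming the definition p(x | t) = m(x AND t) / m(t), with lattice elements represented as sets and the meet given by set intersection. The model names and valuations are hypothetical.

```python
# Hypothetical basis valuations for three mutually exclusive models.
basis_valuation = {"Model A": 2.0, "Model B": 3.0, "Model C": 5.0}


def m(element):
    """Additive valuation of a lattice element (a frozenset of basis names)."""
    return sum(basis_valuation[name] for name in element)


def probability(x, t):
    """p(x | t) = m(x AND t) / m(t), with the meet taken as set intersection."""
    if m(t) == 0:
        raise ValueError("context must have non-zero valuation")
    return m(x & t) / m(t)


everything = frozenset(basis_valuation)
a = frozenset(["Model A"])
a_or_b = frozenset(["Model A", "Model B"])

p_a = probability(a, everything)       # context: all three models
p_a_given_ab = probability(a, a_or_b)  # context narrowed to A-OR-B

print(p_a, p_a_given_ab)
```

Changing the context t changes the probability assigned to the same element, which is the sense in which every probability here is conditional on what is considered a priori.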


21.5.2 Gevidence: Generalisation of Bayesian Evidence 


The concept of explanation is intrinsically tied to the concept of probability in the context 
of statistical explanation. In generalising the concept of probability to general valuations 
(which, as shown above, must also be mathematical measures in the Knuth–Skilling 
construction), statistical explanation can therefore in that process also be generalised. Such a 
generalisation will then involve an evaluation of how well a model corresponds to some 
set of explanatory principles encoded in valuations (e.g. predicting empirical observations, 
satisfying aesthetic criteria, etc.). We therefore turn to the question of how Bayesian evi- 
dence can be generalised on the basis of the lattice probability construction, to provide a 
resolution of the conceptual problems associated with the measure problem. A key ques- 
tion for implementing a generalisation of Bayesian evidence for model inference, is how 
to combine several valuations corresponding to different explanatory criteria and empiri- 
cal/epistemic domains, into a compound ‘net’ valuation. The preceding subsection gives 
the unique answer: by multiplication. Given a set of valuations/lattice representations, there 
is thus a unique way to define a probability based on these through multiplication. 

In analogy with the way in which different physical measurements can be combined to 
form joint likelihoods, other explanatory criteria can be combined in different ways using 
the multiplicative prescription. We therefore define a generalised Bayesian evidence — let 
us call it gevidence for short — by 


\[ P(a, b, c, \ldots; M) = \int p(a \mid \theta; M)\, p(b \mid \theta; M)\, p(c \mid \theta; M) \cdots \Pi(\theta; M)\, \mathrm{d}\theta, \tag{21.7} \]


where the letters a, b, c, ... refer to different valuation prescriptions corresponding to 
explanatory criteria. It is also useful to define the log-quantity 


\[ L_{abc\ldots} = \log P(a, b, c, \ldots; M), \tag{21.8} \]


which is more useful when performing model comparison, since the log-quantities 
add/subtract between models. 
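
A minimal numerical sketch of the gevidence integral and its log-quantity follows, assuming a single parameter theta on [0, 1], a flat prior, and a simple trapezoidal rule. The likelihood, the two ‘simplicity’ valuations and all function names are hypothetical and serve only to show the mechanics.

```python
import math


def gevidence(valuations, prior, grid):
    """Trapezoidal estimate of the integral over theta of
    prior(theta) times the product of all valuations at theta."""
    def integrand(theta):
        value = prior(theta)
        for p in valuations:
            value *= p(theta)
        return value

    total = 0.0
    for lo, hi in zip(grid[:-1], grid[1:]):
        total += 0.5 * (integrand(lo) + integrand(hi)) * (hi - lo)
    return total


grid = [i / 1000 for i in range(1001)]


def flat_prior(theta):
    return 1.0


def likelihood(theta):
    """Criterion a: a Gaussian data likelihood (illustrative numbers)."""
    return math.exp(-0.5 * ((theta - 0.3) / 0.05) ** 2)


def simplicity_model_1(theta):
    """Criterion b for a first hypothetical model."""
    return math.exp(-theta)


def simplicity_model_2(theta):
    """Criterion b for a second hypothetical model, penalising theta more strongly."""
    return math.exp(-3.0 * theta)


L_model_1 = math.log(gevidence([likelihood, simplicity_model_1], flat_prior, grid))
L_model_2 = math.log(gevidence([likelihood, simplicity_model_2], flat_prior, grid))

# Model comparison reduces to a difference of log-quantities.
print(L_model_1 - L_model_2)
```

The two toy models share a likelihood and differ only in the second valuation, so the difference of log-quantities isolates the contribution of that criterion.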

While any valuation measure in itself cannot be ‘proven’, just like for a parameter prior 
it can be founded on theoretical principle and experience. The novel measures are fun- 
damentally no different from conventionally used priors on model parameter space — they 
simply generalise to higher orders of model characteristics. The proposed construction pro- 
vides a prescription for how to carry out model comparison on the basis of such measures. 
It is not primarily intended as a tool to exclude models, but rather a means of maintain- 
ing a principled and systematic approach to comparing models, given certain assumptions. 
Note, however, that it is possible to perform inference on the explanatory criteria through 
re-conditionalising, as discussed in Section 21.5.4. 


21.5.3 Evidence, Elegance, Beneficence 


While the proposed construction does not remove epistemic ambiguity, it provides a 
natural and well-defined framework for examining this ambiguity in a systematic, 
rational and explicit way to determine relative model fitness. A rough way to categorise 
the possibilities is on the basis of empirical/epistemic domain. A practical way to clas- 
sify models is then by theoretical/mathematical structure and physical possibility space. 
This can roughly be translated to correspond to aesthetic and ethical principles. While 
this terminology may appear unorthodox, it emphasises the axiological element which is 
present regardless, but does not therefore imply the presence of any aesthetic or moral 
agent. Model inference thus divides into interlinking empirical, aesthetic and ethical com- 
parisons. This classification may also be useful in that it reflects how scientists intuitively 
tend to approach informal model assessment. To structure the problem, let us therefore 
represent the gevidence in the specific form 


\[ P(D, A, E; M) = \int \mathcal{L}(D \mid \theta; M)\, p(A \mid \theta; M)\, p(E \mid \theta; M)\, \Pi(\theta; M)\, \mathrm{d}\theta, \tag{21.9} \]


Figure 21.1 A schematic axiological Bayesian triangle representation of model evidence P(D; M), 
elegance P(A; M), and beneficence P(E; M). 


where D denotes ‘data’, A denotes ‘aesthetics’, and E denotes ‘ethics’. We thus represent 
models on a direct product set of empirical observables, model structure and model possi- 
bility space. Each of these probabilities may in themselves be subdivided by multiplication 
into any number of component valuations. 

We may also form the partial gevidences P(A, D; M), P(A, E; M), and P(D, E; M), which 
provide additional information about the ways in which the different explanatory criteria 
corroborate or contradict each other. We shall refer to the individual gevidences as evidence 
[P(D; M)], elegance [P(A; M)], and beneficence [P(E; M)]. The log-quantities $L_{ADE}$, $L_{AD}$, 
$L_{AE}$, $L_{DE}$, $L_D$, $L_A$, and $L_E$ will also be of interest for model comparison. 
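
The seven log-quantities can be tabulated with a short sketch, assuming one parameter theta on [0, 1] with a flat prior and trapezoidal integration. The three valuation functions are hypothetical stand-ins for the data (D), aesthetic (A) and ethical (E) criteria.

```python
import math
from itertools import combinations

grid = [i / 1000 for i in range(1001)]  # theta in [0, 1]; a flat prior is assumed

# Hypothetical valuations for a single model (illustrative shapes only).
criteria = {
    "D": lambda t: math.exp(-0.5 * ((t - 0.3) / 0.05) ** 2),  # data likelihood
    "A": lambda t: math.exp(-t),                               # aesthetic valuation
    "E": lambda t: 1.0 / (1.0 + t),                            # ethical valuation
}


def log_gevidence(labels):
    """Log of the trapezoidal integral of the product of the selected valuations."""
    def f(t):
        return math.prod(criteria[k](t) for k in labels)

    area = sum(0.5 * (f(a) + f(b)) * (b - a) for a, b in zip(grid[:-1], grid[1:]))
    return math.log(area)


# Individual, pairwise and full log-quantities, keyed by their criteria labels.
log_quantities = {
    "".join(labels): log_gevidence(labels)
    for r in (1, 2, 3)
    for labels in combinations("ADE", r)
}

for name, value in log_quantities.items():
    print(f"L_{name} = {value:.3f}")
```

In this sketch a pairwise quantity such as L_AD is not the sum of L_A and L_D, because the valuations are multiplied inside the integral. That difference is what indicates whether two criteria favour the same region of parameter space.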

This quantification offers a formalisation of model comparison and explanation across 
the categories, and hence also of problem formulation: just as an unexpectedly rare 
state in model phase space (fine-tuning) may prompt explanation, e.g. an unexpectedly 
un-aesthetic/aesthetic model structure may prompt explanation. 


21.5.4 Implementation and Application 


Let us now turn to how, in practice, the type of framework proposed could be implemented 
and used. A few essential elements are needed: 


• Basis elements for lattice (representation of models); 

• Aesthetic measure/s on model-structure space; 

• Ethical measure/s on model possibility space; 

• Computational capability to evaluate on model-structure space and possibility space. 


Figure 21.2 A schematic axiological Bayesian circle representation of model gevidence, useful for 
model comparison. The implicit model conditioning has been dropped in the figure. The figure is 
divided into a positive and negative half-plane. There are three axes on which are crosses to indicate 
the values of $L_D$, $L_A$ and $L_E$. These axes and the axis separating the half-planes divide the figure into 
eight equal segments. Within these areas are plotted shaded circle segments with areas corresponding 
to the values of $L_{AD}$, $L_{DE}$, $L_{AE}$ and $L_{ADE}$. The shaded circle segments are placed in the segment 
delineated by the axes corresponding to the two criteria in question, e.g. $L_{AD}$ is placed between the 
axes for $L_D$ and $L_A$. In the case of AE, the half-plane axis forms one of the axes. $L_{ADE}$ is plotted 
in the remaining segment. Positive values are indicated by placing the shaded circle segment in the 
positive half-plane, and negative values in the negative half-plane. (This is reinforced by choosing the 
orientation of the shading lines to distinguish positive and negative values.) The areas can be directly 
compared within figures for a given model, and between figures for different models, to indicate 
relative model fitness (a model difference figure can also be constructed in the same way). Recall 
that the larger the values are, the stronger the support is for a model. 


Out of these, the first three appear straightforward to achieve. Some aesthetic measures
of e.g. simplicity are naturally incorporated in the conventional Bayesian statistical
framework (Ockham’s razor); others may conventionally be used more informally. The proposed
framework formalises model structure considerations beyond ‘parameter shaving’ with
Ockham’s razor. A list of some possible aesthetic criteria (after Ellis, 2014) is shown
in Figure 21.3. As an example, ‘connectedness to the rest of science’ might be quantified
on the basis of the different physical constants that appear in a model. Ellis and Uzan

Satisfactory structure


(a) internal consistency, (b) simplicity (Ockham’s razor), 
(c) ‘beauty’ or ‘elegance’; 


Intrinsic explanatory power 


(a) logical tightness, 
(b) scope of the theory — unifying otherwise separate phenomena; 


Extrinsic explanatory power 


(a) connectedness to the rest of science, 
(b) extendability — a basis for further development. 


Figure 21.3 Example aesthetic model comparison criteria, after Ellis (2014). 


(2014) argue that Higgs inflation could be regarded as a preferred inflation model on such 
grounds. Ethical measures are not commonly discussed, although scientists (as everyone) 
will at some level be influenced by such considerations, for example due to their philosoph- 
ical position along the axis from materialism to idealism (ethics defined on exclusively 
materialistic or idealistic grounds can clearly differ significantly). Explicit consideration 
of ethics in relation to cosmology is given by Knobe et al. (2006), who discuss the eth- 
ical implications of inflationary cosmology, and in Murphy and Ellis (1996) where it is 
argued that scientific cosmology points toward a kenotic ethic. Computational capability 
on model structure space should be reasonably adequate with today’s technology. How- 
ever, computations of/on possibility spaces may present serious challenges especially for 
complex theories and measures. 

In practice, the framework can be used much the way models are compared in the light of 
different data sets both separately and jointly to test consistency and combined inferential 
power. Figure 21.1 shows schematically a simple way to illustrate the evidence P(D; M), 
elegance P(A; M) and beneficence P(E; M). A fuller representation is shown schematically 
in Figure 21.2, where the log-gevidences are shown for all possible combinations of cri- 
teria. This figure gives a complete picture of the model gevidence, and can be directly 
compared between models to understand which model is best overall, or with respect to 
combinations of only some of the criteria. This offers possibilities to explore multi-factor 
explanations, i.e. part empirical, part aesthetic, part ethical. It gives a clear picture of the
extent to which empirical data and axiological criteria are consistent with each other.

One may also examine which aesthetic and ethical criteria are ‘preferred’ with the 
help of the framework. By re-conditionalising the probability, one can compute e.g. 
P(D|A;M) = P(D,A;M)/P(A;M) which quantifies the empirical support for the ele- 
gance principle A (under model M). One could also separately study the support for 
particular explanatory criteria across some range of models of relevance by comparing 
$P(C) = \sum_M P(C;M)\,\pi(M)$ for different criteria C, which thus effectively extends the
empirical/epistemic domain to model structure and model possibility space. 


21.6 Concluding Discussion


In this chapter, it has been argued — in concordance with earlier observations by Ellis 
(2014), Smeenk (2014), and others — that 


1. Model inference in cosmology involves both evaluation of empirical statistical evidence 
and application of other interpretative principles. 

2. The Bayesian statistical framework, particularly, suffers from the measure problem in 
relation to explanation of global properties in a whole-Universe and Multiverse context 
(notably, inflationary initial conditions). 

3. Some interpretative principles are not in themselves empirically testable by conven- 
tional Bayesian statistical tests. 

4. Bayesian statistical explanation therefore is effectively qualitative in the whole- 
Universe and Multiverse context, such that Bayesian probability becomes ambiguous 
quasiprobability and the measure problem ill defined. 

5. It is possible to extend Bayesian statistical inference in a natural and well-defined 
way to explicitly account for non-observational explanatory principles and provide a 
conceptually well-defined inferential procedure. 


These considerations lead to the following conclusion. If we accept probability calcu- 
lus as founded on the lattice construction, then the conventional scientific method can 
be regarded as a special case of a more general part-subjective, but uniquely rational, 
framework for reasoning we have termed ‘axiological Bayesianism’. This framework gen- 
eralises Bayesian statistics to define a more general version of Bayesian evidence for model 
inference. We have called this ‘gevidence’ and divided it into three main sub-components: 
evidence, elegance, and beneficence. This enables the inclusion of probabilities based on 
valuation measures on model structure and possibility space, that combine in a unique 
way. The framework appears to have overlap with Dawid’s concept of non-empirical the- 
ory evaluation (Dawid, 2013), and to lend itself to the epistemic theory of justification 
called foundherentism (Haack, 1993), a synthesis of foundationalism and coherentism. The 
framework can be further justified by appealing to epistemological principles of unifor- 
mity/unity and consistency/coherence to be extended to new domains, i.e. model inference 
and comparison. 

Potential problems with the proposed inference approach arise if the rules of probabil- 
ity are themselves global empirical properties of the Universe, just like a physical law, 
since such a ‘probability law’ could be different in other parts of a multiverse. The quasi- 
probability nature of Bayesian statistics in our analysis suggests that the framework may 
be extended to alternative logical foundations for probability. It remains to be seen how the 
framework of axiological Bayesianism might be developed and applied in practice. The 
details of model representation, associated product-space construction and measures, as 
well as computational techniques, need to be worked out. A reference scale for gevidence 
differences would be desirable (perhaps based on the concept of information, see Skilling, 
2010). 

When there are very limited data, it is inevitable in principle that some type of hermeneutic
process comes into play when engaging in inference. We must then either accept additional
explanatory criteria (i.e. not based on data likelihood) as valid ‘scientific method’, or
appeal to some additional principle that invalidates such criteria. One possible such princi- 
ple might be that the type of subjectivity inherent in axiological Bayesianism is outside the 
realms of science, and hence the framework is to be rejected. This is a perfectly valid posi- 
tion to take. However, since it was shown above that the subjective ambiguity is also present 
implicitly in the conventional Bayesian framework, this principle also excludes making 
statements about the measure problem using that same framework. One should then choose 
to stay silent on the matter, to remain consistent. Hence, the proposed framework appears 
to be an in principle necessary, conceptually consistent, and theoretically natural (though 
not necessarily unique) generalisation of the Bayesian statistical framework for addressing 
the measure problem and similar questions, for those who wish not to stay silent. 


Testing the Multiverse: Bayes, Fine-Tuning
and Typicality 


LUKE A. BARNES 


22.1 Introduction 


Theory testing in the physical sciences has been revolutionized in recent decades by 
Bayesian approaches to probability theory. Here, I will consider Bayesian approaches to 
theory extensions, that is, theories like inflation which aim to provide a deeper explana- 
tion for some aspect of our models (in this case, the standard model of cosmology) that 
seem unnatural or fine tuned. In particular, I will consider how cosmologists can test the 
multiverse using observations of this universe. 

Cosmologists will only ever get one horizon-full of data. Our telescopes will see so far, 
and no further. At any particular time, particle accelerators reach to a finite energy scale 
and no higher. And yet, it would be an unnatural constraint on our theories for them to fall 
silent beyond the edge of the observable universe and above a certain energy. Natural, sim- 
ple theories need not confine themselves to the observable. How do we speculate beyond 
current data? 

In particular, how do we evaluate (what I will call) theory extensions? That is, physical 
theories whose main attraction is that they provide a deeper, more natural understand- 
ing of some effective theory. For example, the appeal of cosmic inflation is its natural 
explanation of some of the “initial conditions” of the standard model of cosmology. The 
postulates of the standard model — a homogeneous and isotropic Robertson—Walker (RW) 
spacetime, a set of energy components and their densities (matter, radiation and a cosmo- 
logical constant), and an initial set of adiabatic, Gaussian density and tensor perturbations — 
can explain all (or almost all) the cosmological data at our disposal: the expansion of the 
universe, big bang nucleosynthesis, the angular power spectrum of the cosmic microwave 
background (CMB), the galaxy and Lyman alpha forest power spectra, the baryon acoustic 
oscillation (BAO) scale, the luminosity distance-redshift relation of type Ia supernovae, 
and more. 

So, why not simply declare cosmology to be finished? We have a model that explains 
all the data. Consider the following kind of reason for extending our cosmological theory. 
In the standard model of cosmology, photons in the CMB that are separated in the sky by 
more than ~ 1 degree were scattered by patches of gas that have never been in causal contact 
with each other. And yet the entire CMB is at the same temperature, to one part in 100,000. 
If, alternatively, we propose that there was a period of accelerating expansion in the very
early universe, then the regions we see in the CMB have been in causal contact, allowing 
them to come to thermal equilibrium. And thus, inflation solves the horizon problem, so 
the standard story goes. 

Note well: the horizon problem does not involve a theory failing to predict an obser- 
vation. Theories never predict their initial conditions. Rather, we argue that something 
about our model is open to a deeper explanation because it is unnatural, improbable, or an 
unexplained coincidence. 

Examples could be multiplied. General relativity explains what to Newtonian gravity 
was a bare postulate: the equivalence of inertial and gravitational mass. Supersymmetry 
does not currently explain any data, but would explain why quantum corrections do not 
drive the Higgs boson mass to the Planck scale, a fact which would otherwise be highly
unnatural. 

A calculation is required to make these arguments robust. Returning to inflation: how 
probable is an isotropic CMB given inflation, and not given inflation? And how simple is 
inflation as a hypothesis, given that we do not know what the inflaton is? How generic 
(probable?) are the initial conditions that lead to inflation? Observations can tell us some- 
thing about the initial conditions of the observable universe; when should we accept a 
dynamical theory of those initial conditions, rather than simply postulating them? 

Can we attack these questions with probability theory at all? Cosmology promises 
to stretch our interpretation of probabilities. It will be my contention here that objec- 
tive Bayesian probabilities provide a consistent framework for extrapolating cosmological 
theories beyond our universe, and isolate the pertinent questions to ask such theories. 


22.2 Objective Bayesian Probability 
22.2.1 Probability from Uncertainty 


We will start with an (oversimplified) overview of probability, and in particular my impres- 
sions of how it is used in the physical sciences. The interpretation of probability has a long 
and surprisingly turbulent history. In one corner stand the frequentists, for whom probabil- 
ities measure the relative frequencies of events in hypothetical infinitely repeated trials (or, 
for finite frequentists, in actual, known trials). When a scientist wants to test their ideas, 
they calculate the probability of the data given the theory. If this probability (known as a 
likelihood) passes certain tests, then we can announce that the theory is not disconfirmed. 

The mathematical foundation of this approach was provided by Kolmogorov (1933), 
who builds probability theory from mathematical axioms, independent of any particular 
application to statistics. Probability, like tensor calculus or conic sections, is a tool that 
may or may not be useful to the scientist in the investigation of some physical systems. 

If probabilities are frequencies of outcomes, it makes no sense to ask for the probability 
of a theory. We cannot compare the number of universes that obey Newtonian gravity with 
the number that obey Einstein’s General Relativity. This is not a criticism of frequentism 
by its opponents. Ronald Fisher, the patron saint of frequentism, stated that “we can know
nothing of the probability of hypotheses or hypothetical quantities” (Fisher, 1921). 

In the other corner stand the Bayesians.¹ The basis of this approach is not abstract
axioms but an attempt to start from the desiderata of rationality and develop probability
theory as generalized logic. While classical logic is concerned with what follows
deductively (if A then B), probability theory will include weaker degrees of certainty (if A
then probably B). Probabilities such as p(B|A) (“the probability of B given A”) quantify
the degree of certainty of the proposition B given the truth of the proposition A. Classical
logic’s implication A → B is the special case p(B|A) = 1; those two are the same
statement. The goal is not merely to quantify subjective degrees of belief, that is, the
psychological state of someone who believes A and is considering B. Just as classical logic’s
A → B says nothing about whether A is known by anyone, but instead denotes a connection
between the truth values of the propositions A and B, so p(B|A) quantifies a relationship
between these propositions.²

How should degrees of certainty be assigned to certain propositions? Jaynes (2003) 
invites us to imagine a reasoning robot: insert a given proposition A in one slot, and the 
proposition of interest B in the other slot, and out comes a number indicating the degree of 
certainty. We program the robot according to the following desiderata: 


D1. Probabilities are represented by real numbers. This ensures that degrees of plausibility 
can be compared on a single scale. 

D2. Probabilities change in common sense ways. For example, if learning C makes B more 
likely, but does not change how likely A is, then learning C should make AB more 
likely. 

D3. If a conclusion can be reasoned out in more than one way, then every possible way 
must lead to the same result. 

D4. Information must not be arbitrarily ignored. All given evidence must be taken into 
account. 

D5. Identical states of knowledge (except perhaps for the labeling of the propositions) 
should result in identical assigned probabilities. 


Perhaps surprisingly, these desiderata are enough. Cox’s theorem (Caticha, 2009, and Jaynes,
2003, are required reading) shows that quantities assigned according to these desiderata
obey the same rules as probabilities. In particular, we have a rule for each of the Boolean
operations “and” (AB), “or” (A + B) and “not” ($\bar{A}$),


$$p(AB|C) = p(A|BC)\, p(B|C) = p(B|AC)\, p(A|C) \tag{22.1}$$
$$p(A + B|C) = p(A|C) + p(B|C) - p(AB|C) \tag{22.2}$$
$$p(\bar{A}|C) = 1 - p(A|C). \tag{22.3}$$


1 It is a simplification to speak of just two corners, but sufficient for our purposes.
2 Neither are we considering degrees of truth; A and B are in fact either true or false. 

These are identities, holding for any propositions A, B and C for which the relevant
quantities are defined. In particular, from Eq. (22.1) we can derive Bayes’ theorem,

$$p(A|BC) = \frac{p(B|AC)\, p(A|C)}{p(B|C)}. \tag{22.4}$$

Bayes’ theorem often comes attached to a narrative about “prior” probabilities, which
depend only on “known” “background” information (or worse, temporally prior information)
and are updated with new “data” to produce revised “posterior” probabilities. None
of this is essential to Bayesianism.

The goal of Bayesian probability theory is to calculate the probability of the proposition 
of interest A, given everything we know K. If you are handed p(A|K) from the clouds, then 
your work is done. If, however, p(A|K) is too much to handle then you will have to break 
it into smaller pieces. In particular, the sum total of everything you know K is likely to be 
expressible as a conjunction, K = BC, in which case Bayes’s theorem is very useful. We 
use probability identities to write probabilities we want in terms of probabilities we know. 
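
The identities above can be checked numerically. The following is a minimal sketch, assuming a small made-up joint distribution over three binary propositions A, B and C; the weights and the helper `prob` are illustrative inventions, not taken from any source. It computes each conditional probability directly from the joint distribution and confirms that the two sides of Bayes’ theorem agree.

```python
# Hypothetical joint distribution over three binary propositions (A, B, C).
# The weights are arbitrary illustrative numbers; they are normalized below.
weights = {
    (1, 1, 1): 6, (1, 1, 0): 1, (1, 0, 1): 2, (1, 0, 0): 3,
    (0, 1, 1): 4, (0, 1, 0): 2, (0, 0, 1): 8, (0, 0, 0): 4,
}
total = sum(weights.values())
joint = {abc: w / total for abc, w in weights.items()}


def prob(event, given=lambda a, b, c: True):
    """Return p(event | given) by summing the joint distribution."""
    num = sum(p for abc, p in joint.items() if event(*abc) and given(*abc))
    den = sum(p for abc, p in joint.items() if given(*abc))
    return num / den


# Propositions expressed as predicates on a truth assignment (a, b, c).
A = lambda a, b, c: a == 1
B = lambda a, b, c: b == 1
C = lambda a, b, c: c == 1
BC = lambda a, b, c: b == 1 and c == 1
AC = lambda a, b, c: a == 1 and c == 1

lhs = prob(A, given=BC)                                   # p(A|BC), computed directly
rhs = prob(B, given=AC) * prob(A, given=C) / prob(B, given=C)  # right side of Bayes' theorem

print(lhs, rhs)
```

With these particular weights both sides come to 0.6 (up to floating-point rounding). Any other positive weights would do equally well, since the identity holds for every joint distribution in which the conditioning propositions have non-zero probability.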


22.2.2 The Rise of Bayesianism 


A revolution in the physical sciences over the last few decades has transformed what we do 
with data. New methods have been advanced because of a fundamental change in the way 
that scientists view probability. From these new foundations have come a new approach 
and a new set of tools, all marching under the banner of Bayes. 

To underscore the dominance of Bayesian probability theory, a recent NASA Astro- 
physics Data System (ADS) search of the astronomy and physics literature for articles 
with the word “Bayesian” or “Bayes” in the title returned 7555 papers. A search for “fre- 
quentist” or “frequentism” in the title returned 71 papers, half of which also have “Bayes” 
in the title. Most of these are comparing methods. Frequentist methods are still used, and 
will not always be advertised as such. Nevertheless, this does show how few physicists and 
astronomers advertise their methods as frequentist. I have never seen frequentism defended 
in a scientific paper. On the rare occasions that the word appears, it is usually as a synonym 
for “oversimplified” or “archaic” or “wrong”. 

Why has Bayesianism risen so quickly in the physical sciences? I think that there are 
two main reasons. 

Firstly, Bayesianism makes good sense of theory testing. Figure 22.1 shows the con- 
straints from data from the Planck CMB satellite (Planck Collaboration et al., 2015) on 
the average cosmic density of matter, relative to the critical density. The y-axis shows the 
probability (density) of a particular value of the parameter, normalized to the maximum 
value. 

What exactly does the y-axis quantify? It is not a finite or hypothetical frequency — it’s 
not saying that ~95 per cent of universes we polled (or would hypothetically poll) have a 
mass density parameter between 0.27 and 0.36. The width of the peak is not an indication 
of the range of matter densities in different regions of the universe. It is not a chance, as
if the density of the universe is a stochastic property that every third Sunday of the month
is less than 0.27. The universe only has one value of its average density, and so knows of
only one point on the x-axis.

Figure 22.1 Constraints from the Planck CMB satellite on the average cosmic density of matter.
The y-axis shows the probability (density) of a particular value of the parameter, normalized to the
maximum value.

The y-axis of this plot most plausibly quantifies our degree of certainty. And yet, this 
is not a subjective credence. The Planck data analysis team is not reporting the effect that 
their satellite’s instruments have had on their state of mind. What this plot reports is the 
implications of cosmological data for the knowledge of cosmological parameters. 

More generally, science must be able to conclude, for example, that quantum mechanics 
is more likely to correctly describe atoms than classical electromagnetism. (Otherwise, 
what is the point? We would never learn anything.) This probability must be a statement 
about propositions, about states of knowledge. It cannot be a statement about frequencies 
or chances, because it is not a statement about the universe at all, or even a hypothetical 
ensemble of universes. Nature knows nothing of our incorrect theories. 

This does not mean that frequencies and chances are useless. A frequency is a useful 
way to describe data. Chances are legitimate postulates of a physical theory, for example 
in describing the macroscopic state of a thermodynamic system or the indeterminacy of 
quantum systems. Bayesian probability theory does not imply that quantum probabilities 
are epistemic, or that statistical mechanics needs only human ignorance to link microphysics
with thermodynamics. Rather, the claim is that frequencies and chances are insufficient for 
testing theories. 

Secondly, the practice of Bayesian statistics exhibits a deep clarity and unity. The meth- 
ods of orthodox statistics are a grab-bag of techniques, each intuitively reasonable but 
without any deeper insight into which is the best, or even of what “best” should mean.
For example, Jaynes (2003) reports that, faced with linear regression (with both variables 
subject to an error of unknown variance), the orthodox textbook of Kempthorne & Folks 
(1971) formulates 16 different methods, and, being unable to choose between them, con- 
cludes with “It is all very difficult”. A later survey of orthodox methods can “give only a 
long, somewhat dreary, list of one adhockery after another, with no firm final conclusions”. 
In contrast, the Bayesian approach gives the scientist the impression of asking the right 
questions of the data, with no hidden assumptions and no black boxes. 


22.2.3 Has Bayesianism Succeeded? 


The claim of the Bayesian is that there are objective degrees of certainty or credences 
that can be modeled as probabilities. They are neither frequencies (actual or hypothetical), 
chances, nor merely subjective. 

The reader might, and probably (!) should, be skeptical as to whether such an ambi- 
tious quantification of reasoning has indeed been achieved by the Bayesians. It might seem 
like alchemy, turning the base metal of ignorance into the gold of a precise probability 
distribution. Keep in mind, however, that Bayesian probabilities do not imply statistical 
frequencies: it does not follow from p(B|A) = 0.5 that there is a population of As that we 
could sample, half of whose members are Bs. 

Further, Bayesian probabilities do not quantify everything that A says about B. Suppose 
that a mystery black box will flip a coin. What is the probability of heads H, given this 
information (A)? The Bayesian has no reason to prefer one side to the other; in particu- 
lar, the coin and/or box might be biased towards one side, but we do not know which. To 
reflect this ignorance, we assign p(H|A) = 0.5. Now suppose that we examine the coin and 
box, and discover that the coin is (as best we can tell) perfectly symmetric and unbiased, 
and inside the box we find a mechanism that has shown no evidence of bias in the last 
billion flips. What is the probability of heads H, given this new, detailed information B? It 
has not changed: p(H|B) = 0.5. Should the Bayesian be worried that the probability does 
not reflect the vast difference in the information in A and B? Should we seek to expand 
probability to take into account this difference, using fuzzy probabilities or assigning dis- 
tributions rather than numbers? Perhaps. But the unchanged probability is in some sense 
the right answer. Sure, we have learned a lot about the coin and the box, but this knowledge 
should not have changed our belief that heads will turn up.³
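
One way to make the coin example concrete is sketched below. It rests on a modelling assumption that is not in the text: the unknown bias of the coin is given a Beta(a, b) distribution, with a flat Beta(1, 1) standing in for state of knowledge A and a very sharply peaked Beta distribution standing in for B (roughly, a billion unbiased-looking flips). The function name `predictive_heads` is hypothetical. Under that assumption the probability of heads on the next flip is the same in both states, but the two states respond very differently to a run of ten heads.

```python
from fractions import Fraction


def predictive_heads(a, b, heads=0, tails=0):
    """Probability that the next flip is heads, given a Beta(a, b) prior on
    the coin's bias and the observed numbers of heads and tails."""
    return Fraction(a + heads, a + b + heads + tails)


# Assumed stand-ins for the two states of knowledge in the text.
state_A = (1, 1)                        # nothing known: flat prior on the bias
state_B = (500_000_000, 500_000_000)    # a billion flips with no sign of bias

p_A = predictive_heads(*state_A)
p_B = predictive_heads(*state_B)
print(p_A, p_B)  # both are exactly 1/2

# After ten heads in a row the two states of knowledge come apart.
p_A_after = predictive_heads(*state_A, heads=10)
p_B_after = predictive_heads(*state_B, heads=10)
print(float(p_A_after), float(p_B_after))
```

In this sketch the single number 1/2 is the same for A and B, while the difference in information shows up only in how quickly the number moves when new flips are observed.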

The assignment of probabilities is not derailed by ignorance. Ignorance is a state of 
knowledge, and probabilities describe states of knowledge. It may seem like assigning 
p(H|A) = 0.5 using the principle of indifference is misleadingly precise. We should reserve
definite probability assignments for cases like B, and should instead say of A that “I do 


3 It will, appropriately, change the probability that the coin is biased, given a sequence of flips. Given A, a series 
of repeated heads will quickly convince us that the coin is biased. Given B, we will resist such a conclusion for 
longer, believing in the light of our examination of the coin and box that the repeated heads are mere chance. 
not know”. But this would sell ourselves short. “I do not know which one of these two
statements is true” is a very different state of knowledge from “I don’t know which one of 
these trillion statements is true”. Our probabilities can and should reflect the size of the set 
of possibilities; the principle of indifference is invoked as a special case when this size is 
all we have. The assigned probabilities are only misleadingly precise if overinterpreted. 

Nevertheless, Bayesian probability theory is not without worries. Some are
pseudo-problems, such as the “problem of old evidence” (Glymour, 1980).⁴ More troubling is
the assignment of prior probabilities. Recall that prior probabilities are simply probabili- 
ties calculated using less than everything we know. So the problem is really: how do we 
assign probabilities when we do not know very much? The problem of the prior is particu- 
larly acute when faced with a continuum of possibilities, such as a probability distribution 
over a variable. We cannot say that each value is equally probable, or that each interval 
in an infinite range is equally probable, since these distributions do not sum (integrate) to 
one. The probabilities are worryingly shuffled by a change in variable. How do we model 
ignorance of an infinite number of possibilities? 
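
Both difficulties can be seen in a few lines. The sketch below is only an illustration with arbitrary numbers: it compares a prior that is flat in x with one that is flat in log x on the same finite range, and then lets the upper limit of the flat-in-x prior grow. The function name `prob_below` and the chosen range are assumptions made for the example.

```python
import math


def prob_below(cut, lo, hi, scale="linear"):
    """Probability that x < cut when the prior is flat in x ("linear")
    or flat in log x ("log") on the finite range [lo, hi]."""
    if scale == "log":
        cut, lo, hi = math.log(cut), math.log(lo), math.log(hi)
    return (cut - lo) / (hi - lo)


# Same question, same range, two "ignorance" priors: is x below 10?
p_linear = prob_below(10.0, 1.0, 100.0, scale="linear")  # 9/99, about 0.09
p_log = prob_below(10.0, 1.0, 100.0, scale="log")        # one half
print(p_linear, p_log)

# As the range grows without limit, a flat prior assigns vanishing
# probability to any fixed interval, so it cannot be normalized.
shrinking = [prob_below(10.0, 1.0, hi) for hi in (1e2, 1e4, 1e6)]
print(shrinking)
```

The two answers to the same question differ by a factor of more than five, purely because of the choice of variable in which the prior is taken to be flat.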

Various methods have been advanced to solve this problem, including Jeffreys’ prior
and Jaynes et al.’s principle of maximum entropy. Whether these are successful is
beyond the scope of this paper, but their failure would not sink Bayesianism. It would 
leave an open problem in the program. The most that the Bayesian might have to give 
up in light of these worries is that probabilities can be assigned to any proposition 
given any state of knowledge. For example, it seems absurd to suppose that there is 
such a thing as the probability that “the toilet paper is purple” given that “the plate is 
orange,”⁵ that there is some number that uniquely captures the relationship between those
propositions. 

Faced with infinite possibilities, or vague statements about purple toilet paper, we might 
have to refrain from assigning a probability until more information is given. Jaynes (2003) 
argues that the problem of infinities is similar to the problem of vague statements — we 
have not really specified the problem until we know the limiting procedure that generates 
the infinity. Where one should draw the “too vague” line, however, is not clear. 


22.3 Extending the Laws of Nature (As We Know Them) 
22.3.1 Taking Stock 


We want to apply Bayesian probability theory to the extension of the laws of nature, and 
then in particular to the multiverse. First, we must take stock of the laws of nature as 
we know them. We consider the somewhat idealized case in which we have identified 
the effective laws of nature that govern the physical regimes relevant to our observational 
evidence. Let, 


4 Exercise for the reader. Hint: P(E|B) = 1 does not follow from “I know E”. 
5 Thanks to Eric Winsberg for this example. 

• U = our observations of this universe;
• B = everything else we know;
• L = the laws of nature as we know them.


U represents the sum total of our observations of this universe, including every telescope 
observation and every experiment. B represents everything else we know, such that UB 
represents everything we know. B includes mathematical knowledge, and in particular all 
of theoretical physics. A statement such as “a bound test particle moving according to 
Newton’s law of gravity would obey Kepler’s laws of planetary motion” is true even if no 
particles actually obey Newton’s law. It is not a statement about the actual world. 

Regarding L, I am thinking here of the Lagrangian of the standard model of parti- 
cle physics plus general relativity, but the details would not much matter. In a typically 
entertaining footnote, David Griffiths imagines “that God has a giant computer-controlled 
factory, which takes Lagrangians as input and delivers the universe they represent as 
output” (Griffiths, 2008, p. 373). 

Actually, we need more than just the functional form of the Lagrangian. The equations 
of the laws of nature — as we know them — contain free parameters, numbers which are not 
predicted by the theory itself, but without which the laws are not fully specified. In addition, 
it is the solutions to the equations that describe a possible universe. We require further 
parameters to specify a particular universe from among the family of possible universes. 
These are usually specified as initial conditions, or more generally, boundary conditions. 

We will represent the free parameters of the laws of nature, referred to as the constants of
nature, as the set of numbers $\alpha_L$. Similarly, we will represent the initial conditions
required to specify a solution/universe by⁶ $\beta_L$. The subscript L is a reminder that it is
only in the context of a particular theory that a measurement of our universe becomes a
fundamental constant.

We wish to evaluate the probability of our theory L, given the evidence we have, p(L|UB).
We use Bayes’ theorem:

$$p(L|UB) = \frac{p(U|LB)\, p(L|B)}{p(U|B)}. \tag{22.5}$$

However, L is missing its parameters, and will not predict quantities until they are specified.
We can introduce the free parameters $\alpha_L$, $\beta_L$ as nuisance parameters, to be
integrated out:

$$p(L|UB) = \frac{p(L|B)}{p(U|B)} \int p(U|\alpha_L \beta_L LB)\, p(\alpha_L \beta_L|LB)\, d\alpha_L\, d\beta_L. \tag{22.6}$$

A few points to note. The first term on the top is the “prior” probability of the law L,
p(L|B). This is the probability that L describes this universe, given no information about 
this universe. Here is the place to formalize and implement Occam’s razor — we expect sim- 
pler theories to be more probable (the interested reader is encouraged to consult MacKay, 
2003, Chapter 28). 


6 The notation can be easily extended to functions or more advanced mathematical structures than lists of 
numbers. 

The first term inside the integral (the likelihood) is where the theory, equipped with the
appropriate constants and initial conditions, shows its predictive power by predicting obser- 
vations. The second term inside the integral is the prior probability of the free parameters, 
that is, the probability of the parameters falling into a certain range, given no informa- 
tion about this universe. Note that this term takes L as given — the parameters have no 
law-independent meaning. 

Our observations of the universe constrain not only L but also its free parameters. We can,
with a slight abuse of notation,⁷ denote by $\alpha_L^U$ and $\beta_L^U$ the set of free parameters consistent
with experiment, such that

$$p(\alpha_L^U \beta_L^U|LUB) > p(\overline{\alpha_L^U \beta_L^U}|LUB). \tag{22.7}$$


Our goal as physicists is to identify the laws of nature that govern our observations of 
the universe. Ideally, L describes our observations better than any rival theory, $p(L|UB) > p(\bar{L}|UB)$,
and while there exists a range of candidate deeper theories into which L could
be embedded, none is significantly preferred by our data. We do not assume that L is the 
ultimate law of nature. 


22.3.2 Why Extend the Laws?


One particular way in which we would like a deeper physical theory to differ from current 
theories is with regard to the constants of nature. In particular, we want them gone, and 
we can see why from Eq. (22.6). A sharply peaked $p(U|\alpha_L \beta_L LB)$ as a function of the
free parameters $(\alpha_L, \beta_L)$ is precisely what physicists usually mean by “fine-tuned”: if
the theory only adequately explains the data for a very narrow range of its free parameters,
then we are suspicious. To illustrate in the one-dimensional case (Figure 22.2), suppose that
the prior $p(\alpha|LB)$ is non-zero over a range $\sim R_\alpha$, and the likelihood $p(U|\alpha LB)$ is sharply
peaked in a range of values $\Delta\alpha$, and negligible outside (remembering that the prior is
normalized over $\alpha$, but the likelihood is not). Then when we integrate over the nuisance
parameter $\alpha$, $p(U|LB) \sim \Delta\alpha / R_\alpha$. Unless the prior probability is fortuitously peaked in the
same range, the likelihood $p(U|LB)$ will be very small.
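
The scaling of the marginal likelihood with the ratio of peak width to prior range can be checked numerically. The following is a minimal sketch under assumptions that are not in the text: the prior is taken to be uniform over the whole range, and the likelihood is taken to be an unnormalized Gaussian bump of height one, whose area serves as the effective peak width. The function names are illustrative only.

```python
import math


def marginal_likelihood(prior_range, sigma, center=0.0, n=200_001):
    """Approximate p(U|LB) = integral of p(U|a,LB) * p(a|LB) over a.

    Illustrative assumptions: the prior p(a|LB) is uniform on
    [-prior_range/2, prior_range/2], and the likelihood is an
    unnormalized Gaussian bump of height 1 and standard deviation sigma.
    """
    lo = -prior_range / 2.0
    step = prior_range / (n - 1)
    prior_density = 1.0 / prior_range  # the prior is normalized over a
    total = 0.0
    for i in range(n):
        a = lo + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoidal rule
        likelihood = math.exp(-0.5 * ((a - center) / sigma) ** 2)
        total += weight * likelihood * prior_density * step
    return total


def effective_width(sigma):
    """Effective peak width: the area under a height-1 Gaussian bump."""
    return sigma * math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    R = 1000.0  # hypothetical prior range
    for sigma in (50.0, 5.0, 0.5):
        numeric = marginal_likelihood(R, sigma)
        estimate = effective_width(sigma) / R
        print(f"sigma={sigma:5.1f}  integral={numeric:.6f}  width/R={estimate:.6f}")
```

Under these assumptions the numerical integral and the width-over-range estimate agree closely, and narrowing the peak by a factor of ten lowers the marginal likelihood by the same factor.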

The discovery that a theory is fine tuned opens the door for an alternative theory to 
replace it. This theory could have a broader likelihood, a narrower prior, or have no free 
parameters at all. Note that a preference for such a theory is not merely aesthetic, nor 
simply the desire to summarize the behaviour of nature as succinctly as possible. 


7 Specifically, there are two abuses of notation. We are using $\alpha_L^U$ and $\beta_L^U$ to refer to parameter regions, whereas
before $\alpha_L$ and $\beta_L$ referred to particular values. Secondly, we should be placing propositions into our probability
functions. We can think of $\alpha_L^U$ as representing the proposition “the values of the fundamental constants of the
theory L lie in the region $\alpha_L^U$”.
Figure 22.2 A Bayesian picture of a fine-tuned theory. $p(U|LB) = \int p(U|\alpha LB)\, p(\alpha|LB)\, \mathrm{d}\alpha \approx \Delta\alpha / R_\alpha \ll 1$.


22.3.3 How to Evaluate a Theory Extension 


So, we seek an extension to the laws of nature in which fewer arbitrary constants appear. 
How does the Bayesian evaluate theory extensions? 

Consider, as an example, a detective entering a crime scene. She relies on background 
evidence B (what she knew before she entered the room), and inside the room she collects 
evidence E. The evidence clearly indicates that K, a local thug, is the killer: $p(K|EB) \gg p(\bar{K}|EB)$.
Still, there may be puzzling or suspicious aspects of the hypothesis K; perhaps K
did not know the victim. We thus are led to consider other propositions; not rival theories, 
but extensions to K. We might wonder whether K was (C) contracted to kill the victim by 
a local mob boss. We can evaluate this extended hypothesis in light of the data as follows: 


p(CK|EB) = p(C|KEB)p(K|EB). (22.8) 


Now, we suppose that C does not explain the evidence of the crime scene beyond the 
hypothesis K, p(E|CKB) = p(E|KB). That is, C seeks to explain K, and K explains E. For 
example, K’s fingerprints at the scene are not rendered more or less probable by his status 
as a contract killer. We can then write, 


$$p(CK|EB) = \frac{p(K|CB)}{p(K|B)}\, p(C|B)\, p(K|EB). \tag{22.9}$$


There are three factors of interest here. The first fraction denotes the probability of K being 
the killer given the contract hypothesis (and B), relative to the probability of K being the 
killer given background information alone. This is where the theory extension shows its 
worth, by leading us to expect that K would kill the victim. The second term is the prior 
probability of C, p(C|B); the theory C is penalized if it is implausible given the background 
information. Thirdly, p(K|EB) is the posterior probability of K, which by hypothesis is 
close to one. 
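
The step from Eq. (22.8) to Eq. (22.9) is not spelled out. One way to fill it in, using only Bayes' theorem and the stated assumption that C adds nothing to the explanation of E beyond K, is sketched below.

```latex
\begin{align*}
p(C|KEB) &= \frac{p(E|CKB)\,p(C|KB)}{p(E|KB)} = p(C|KB)
  && \text{since } p(E|CKB) = p(E|KB), \\
p(C|KB) &= \frac{p(K|CB)\,p(C|B)}{p(K|B)}
  && \text{Bayes' theorem applied to } C \text{ and } K, \\
p(CK|EB) &= p(C|KEB)\,p(K|EB)
  = \frac{p(K|CB)}{p(K|B)}\,p(C|B)\,p(K|EB).
\end{align*}
```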
22.3.4 Extending the Laws of Nature


Consider an extension to the laws of nature. We consider a deeper theory T, which aims to
explain the laws and constants of nature as we observe them, $L \alpha_L^U \beta_L^U$ (for convenience,
we will write $L_{\alpha\beta}^U = L \alpha_L^U \beta_L^U$ to denote the whole “laws + parameters” package). We
assume that this deeper theory does not explain the data we observe beyond its ability
to explain L, that is, $p(U|T L_{\alpha\beta}^U B) = p(U|L_{\alpha\beta}^U B)$. For example, let $L_{\alpha\beta}^U$ be the standard
model of cosmology, beginning just prior to nucleosynthesis, and let T be inflation, which
ends well before nucleosynthesis. Our prediction of the statistical properties of the CMB
needs only $L_{\alpha\beta}^U$; inflation does not predict the properties of the CMB beyond predicting the
“initial conditions” of the standard model of cosmology.
The formalism is then analogous to the crime scene case above: 


$$p(T L_{\alpha\beta}^U|UB) = \frac{p(L_{\alpha\beta}^U|TB)}{p(L_{\alpha\beta}^U|B)}\, p(T|B)\, p(L_{\alpha\beta}^U|UB). \tag{22.10}$$


We can expand the fraction above, 


$$p(T L_{\alpha\beta}^U|UB) = \frac{p(\alpha_L^U \beta_L^U|TLB)}{p(\alpha_L^U \beta_L^U|LB)}\, \frac{p(L|TB)}{p(L|B)}\, p(T|B)\, p(L_{\alpha\beta}^U|UB). \tag{22.11}$$

This is similar to the Bayesian formalism by which theories are tested with data, except 
that we are testing the theory extension T by using the effective theory and its measured 
constants $L_{\alpha\beta}^U$ as if they were data. Equation (22.11) highlights three questions to ask of
any proposed extension to the laws of nature as we know them. Firstly, given the theory 
T, the effective laws of nature L and background information B, how probable are the con- 
stants and initial conditions of our universe? Secondly, given the theory T and background 
information B, how probable are the effective laws of nature L? Finally, given background 
information B, how probable is the theory T? 
Let us look at some ways to do away with free parameters. 


22.4 Extension 1: Replace Free Parameters with Mathematical Constants 


To some, free parameters are a call to action, a hot poker in the Bayesian posterior. We 
are not satisfied, and we will not be satisfied until every physical measurement can be pre- 
dicted from theory alone. Einstein (1949) dreamed of a set of equations such that “within 
these laws only rationally completely determined constants occur (not constants, therefore, 
whose numerical value could be changed without destroying the theory)”. 

In our formalism, this theory would set $p(L_{\alpha\beta}^U|TB) = 1$: given the deeper theory, there
is only one low-energy effective theory with only one possible value of each “constant”.
Measuring the constants of nature would be akin to drawing a circle and determining its
radius and circumference in order to “measure” $\pi$.

Unifying scientific theories can reduce the number of free parameters in physics. For 
example, Maxwell’s unification of electricity and magnetism showed that $c = 1/\sqrt{\epsilon_0 \mu_0}$ (c is the
speed of light, $\epsilon_0$ the vacuum permittivity, $\mu_0$ the vacuum permeability), thus reducing the number
of free parameters of physics by one. This is a step in the right direction, but the progress 
of science can just as easily increase the number of constants by, for example, discovering 
a new fundamental particle. 

Einstein’s dream is not without its worries. A “perfect” unity likelihood is often a clue 
that the theory is ad hoc or jerry-rigged. For example, a theory with a large number of 
siblings — that is, mutually exclusive but similar theories that are equally probable given 
our background information — will only receive a small slice of the total prior probability 
of the family. This is, in essence, why theories with free parameters are suspicious in the 
first place. The theory can be thought of as a large family of theories, one for each value of 
the free parameter. 

Thus, we need to worry about the prior probability of our deeper theory p(T|B). It may 
have no free parameters, but if it is but one member of a large set of similar theories, the 
prior probability may still be small. In particular, while by hypothesis we cannot vary the 
parameters of the theory, this may merely indicate that we must look for fine-tuning at 
the next level deeper, as it were. Varying the effective parameters of our laws may require 
varying the deeper theory, leaving us no less at the mercy of a large set of possibilities. 

This highlights one of Steven Weinberg’s wishes in “Dreams of a Final Theory” 
(Weinberg, 1993), which he calls logical isolation. Weinberg argues that, while quantum 
mechanics is not logically inevitable, “any small change in quantum mechanics would 
lead to logical absurdities” (p. 70). In this sense, there is no obvious continuum of theories, 
of which quantum mechanics is just one. The Bayesian argument above fits nicely with 
Weinberg’s intuition. Total logical isolation, however, seems too much to ask. Mathemati- 
cal consistency is not trivial, but neither is it a rarity. There is no ultimate equation of our 
physical universe to which we can hope to say “mathematically, that’s how things must be”. 

In addition, a theory that requires no initial conditions, or that somehow predicts its own 
initial conditions, would be rather strange. Rather than specifying the dynamical properties 
of physical objects in the form of counterfactuals, it would specify the state of the universe. 
For example, a Newtonian version of such a theory would not state that if two masses
$(m_1, m_2)$ are separated by distance r, then they would experience a force with magnitude
$G m_1 m_2 / r^2$. Rather, it would specify position as a function of time r(t) for each particle
in the universe. Rather than the complexity of the phenomena of the universe giving way 
to simple fundamental laws, a theory with no initial conditions would seem to require 
complexity all the way down. 


22.5 Extension 2: Replace Free Parameters with Dynamical Entities 


We have expounded the ingredients of physical theories as we know them: laws, constants 
and initial (or boundary) conditions. The laws describe dynamical entities — fields, parti- 
cles, spacetime, etc. So, one way in which a constant could disappear in a deeper theory 
is by changing identity to become a dynamical quantity. The fine-structure constant, for 
example, could be the local value of a field. We can test this hypothesis by looking for 
changes in the value of the fine-structure constant over cosmic time and cosmic distances.
To date, no convincing variation has been found (Cameron and Pettitt, 2012; King et al.,
2012; Webb et al., 2011; Whitmore and Murphy, 2015).

Two problems immediately arise. Firstly, if the fine-structure constant is replaced by a
quantum field, then it seems that we have merely replaced one constant with the parameters
that describe the field (in fact, a field that varies so slowly over the observable universe
requires a very low mass). Secondly, even if we could replace our constants with a totally
constant-free field, this does not seem like progress. We have replaced a single number
with a function: an infinite collection of numbers, one attached to each spacetime point. If
we are in a typical place in the universe, then there is no further rationale for the value of
the “constant” that we observe. There is some function that varies across spacetime, and
we happen to be in the part that has $\alpha \approx 1/137$.


22.5.1 The Fine-Tuning of the Universe for Intelligent Life 


However, there are good reasons to believe that we are not in a typical place in the universe. 
The universe is not an experiment. We are not Dr. Frankenstein, setting up our equipment, 
choosing the initial conditions, and observing the set-up at our leisure. We are the monster 
— we have awoken in a laboratory and are trying to figure out how it made us. Not all rooms 
can create a monster, so the fact that we are observing at all is a very stringent constraint 
on the contents of the room. 

Similarly, not all laws of nature can be scientific laws, because not all laws of nature 
create scientists. There are certain equations that will not be written on a chalkboard in 
any universe that they describe. If the evolution of conscious observers shows a strong 
preference for certain laws or certain regions of parameter space, then an explanation for 
the values of the constants naturally arises. The reason why this set of constants exists at 
all is that there is a sufficiently large number of universe domains, with enough variation 
in their properties that at least one of them would hit on the right combination for life. The 
reason why we observe that we are in one of these rare regions is that we could not be 
anywhere else. 

Beginning in the 1970s, a number of physicists have noticed the extreme sensitivity of 
the life-permitting qualities of our universe to the values of many of the physical constants 
and cosmological parameters of our universe. Seemingly small changes in the free parame- 
ters of the laws of nature as we know them have dramatic, uncompensated and detrimental 
effects on the ability of the universe to support the complexity needed by physical life 
forms. I have elsewhere reviewed the scientific literature on the fine-tuning of the universe 
for intelligent life (Barnes, 2012). Here are a few examples. 


• The existence of structure in our universe at all places stringent bounds on the cosmological
constant. Compared to the range of values for which our theories are well
defined (roughly ± the Planck scale), the range of values that permit gravitationally
bound structures is no more than one part in $10^{110}$.

• A universe with structure also requires a fine-tuned value for the primordial density contrast
Q. Too low, and no structure forms. Too high, and galaxies are too dense to allow
for long-lived planetary systems, as the time between disruptions by a neighbouring star
is too short. This places the constraint $10^{-6} < Q < 10^{-4}$ (Tegmark and Rees, 1998).

• The existence of long-lived stars, which produce and distribute chemical elements and
are a stable source of energy that can power chemical reactions, requires an unnaturally
small value for the “gravitational coupling constant” $\alpha_G = m_{\rm proton}^2 / m_{\rm Planck}^2$ or, equivalently,
that the proton mass be orders of magnitude smaller than the Planck mass. For
stars to be stable at all, we require $\alpha_G \lesssim 10^{-33}$ (Adams, 2008).

• The existence of any atomic species and chemical processes whatsoever places tight
constraints on the relative masses of the fundamental particles and the strengths
of the fundamental forces. For example, Barr and Khan (2007) show the effect of
varying the masses of the up and down quark, and find that star-and-chemistry-permitting
universes are huddled in a small shard of parameter space which has area
$\Delta m_{\rm up}\, \Delta m_{\rm down} / m_{\rm Planck}^2 \sim 10^{-42}$.


Note that these constraints are all multi-dimensional; I have quoted one-dimensional 
bounds for simplicity. See Barnes (2012) and references therein for plots demonstrating 
these and more constraints in multiple dimensions of parameter space. (It has never been 
the case that the fine-tuning literature has varied one variable at a time.) 

These small numbers ($10^{-110}$, $10^{-4}$, $10^{-33}$, $10^{-42}$) are, in the Bayesian fashion, an
attempt to quantify our ignorance. We are not assuming the existence of a random universe- 
generating machine, nor describing the properties of a real or imagined statistical sample. 
The laws of nature as we know them contain arbitrary constants, which are not constrained 
by anything in theoretical physics. As usual, we can react to small probabilities in a couple 
of ways. Perhaps, like the probability of a deck of cards falling on the floor in a particular 
order, something improbable has happened. Enough said. Alternatively, like the probability 
that the burglar correctly guessed the 12-digit code by chance on the first attempt, it may 
indicate that we have made an incorrect assumption. We should look for an alternative 
assumption (or theory), on which the fact in question is not so improbable. 


22.5.2 Making Predictions in a Multiverse 


Theories are tested by their predictions, and we saw above that theory extensions are tested 
by their ability to predict the effective laws and constants of nature. In practice, this means 
calculating likelihoods. 

The multiverse is an example of a “population plus selection effect” explanation. There
is some observed outcome X to be explained, and X is highly improbable on any single trial. 
We postulate a large, varied population to explain why any X exists at all, and a selection 
effect to explain why we observe X. For example, the front page of the newspaper reports 
correctly that Keith won the lottery. The probability of any particular person winning the 
lottery is very small. This occurrence is made more probable if we suppose that there is a 
large number of lottery players buying different tickets, and that only a lottery win would 
be considered newsworthy. 
Where is the relevant selection effect when we are attempting to explain the statement
that the effective laws of nature are L and the associated free parameters are $\alpha_L^U$, $\beta_L^U$?
Recall that U represents everything that I know about this universe. Thus, to explain U, the
proposition $L \alpha_L^U \beta_L^U$ must refer to this universe, the universe that I inhabit. $L \alpha_L^U \beta_L^U$ cannot
simply state that “there is at least one universe in which the law L holds and in which the
constants are $\alpha_L^U$, $\beta_L^U$”, because this will not explain the fact that I observe U.

This highlights an important difference in probability between calculating the probabil- 
ity that “this X is Y” and “there is at least one X that is Y”. Suppose I have just watched 
Alice deal herself five royal flushes in a row in a game of poker. The probability of these 
five hands being five royal flushes assuming a fair deal is $10^{-30}$, making us wonder if Alice
is cheating. The probability that someone, somewhere has fairly dealt five royal flushes 
depends on the number of poker deals there have ever been anywhere in the universe. If 
the universe is infinite, then this probability is one, making it useless for deciding whether 
Alice is cheating. As the Bayesian desiderata state, information must not be arbitrarily 
ignored. Reasoning as if we only knew that “there is at least one instance of five royal 
flushes” is to discard information. 
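
The contrast between “this X is Y” and “there is at least one X that is Y” can be made concrete with a short calculation. The sketch below assumes a standard 52-card deck, five-card hands dealt independently and fairly, and a purely hypothetical number of independent five-deal sequences across the universe; none of these specifics come from the text.

```python
import math
from fractions import Fraction

ROYAL_FLUSHES = 4                    # one royal flush per suit
FIVE_CARD_HANDS = math.comb(52, 5)   # number of distinct five-card hands


def p_these_hands(n_hands=5):
    """Probability that these particular n_hands fair deals are all royal flushes."""
    return float(Fraction(ROYAL_FLUSHES, FIVE_CARD_HANDS) ** n_hands)


def p_at_least_one(p_single, n_trials):
    """Probability that at least one of n_trials independent attempts succeeds.

    Computes 1 - (1 - p)^N via log1p/expm1 so that tiny p is not rounded away.
    """
    return -math.expm1(n_trials * math.log1p(-p_single))


if __name__ == "__main__":
    p = p_these_hands()
    print(f"this sequence of five deals: {p:.3e}")
    # Hypothetical numbers of five-deal sequences ever played anywhere.
    for n_trials in (1.0, 1e20, 1e30, 1e40):
        print(f"N={n_trials:.0e}: at least one such sequence: {p_at_least_one(p, n_trials):.3e}")
```

As the hypothetical number of sequences grows, the second probability climbs towards one while the first does not move at all, which is why only the first bears on whether Alice is cheating.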

Note that the correct distinction is not between first and third person probabilities, as is 
sometimes assumed in the multiverse literature. Third person probability can be as specific 
(“a particular X is Y”) as first person probability. Also, there is nothing “mystical” about
using indexical information in probabilities (Neal, 2006); “I” can successfully select a particular
individual (in this case, the speaker of the sentence or calculator of the probability)
without assuming that the individual is unique in reality on account of “some essence”.

So, what is the likelihood that this universe has the observed constants, given a multi- 
verse theory? We can calculate this in two pieces. We first calculate the probability that 
observers exist at all in the multiverse (O). So long as observer-permitting universes have 
non-zero chances and the universes in the multiverse are sufficiently varied, this probability 
will approach unity as the number of universes increases. 

With an actual population of universes, the second probability piece is equal to a frequency:
the fraction of observers (or observer moments) that observe our particular set of
constants $\alpha_L^U \beta_L^U$. This will depend on two factors: the rate $R_{\rm obs}$ (per unit time and volume
$\mathrm{d}x^\mu$) at which observers/observations are made at a particular point in spacetime, given the
values of the “constants”, and the probability of a particular set of constants at a particular
spacetime point. Considering just the constants ($\alpha_L$):

$$N_{\rm obs} = \int\!\!\int R_{\rm obs}(x^\mu|\alpha_L TLB)\, p(\alpha_L|x^\mu TLB)\, \mathrm{d}x^\mu\, \mathrm{d}\alpha_L \tag{22.12}$$

$$N_{\rm obs}(\alpha_L^U) = \int_{\alpha_L^U}\!\int R_{\rm obs}(x^\mu|\alpha_L TLB)\, p(\alpha_L|x^\mu TLB)\, \mathrm{d}x^\mu\, \mathrm{d}\alpha_L \tag{22.13}$$

$$\Rightarrow\quad p(\alpha_L^U|OTLB) = \frac{N_{\rm obs}(\alpha_L^U)}{N_{\rm obs}} \tag{22.14}$$

The fine-tuning of the universe for intelligent life suggests that $R_{\rm obs}$ is strongly peaked in
our neighborhood of parameter space, meaning that while regions of the universe with our
constants are rare, they may be likely (or at least, not too unlikely) to be observed.

However, fine-tuning for life is not enough to ensure that a multiverse successfully pre- 
dicts our constants of nature. The form of life with which we are familiar came about 
through biological evolution, via a gradual build up of complexity over timescales that are 
orders of magnitude longer than the lifetime of any particular individual. Such life forms 
require a stable planetary surface, a stable star producing usable photons, a ready supply 
of chemicals and so forth. However, observers could form without this history and envi- 
ronment as thermodynamic fluctuations. These Boltzmann Brains can cause problems for 
a multiverse theory because they mean that $R_{\rm obs}$ does not fall exactly to zero in seemingly
hostile regions of parameter space. 

In Eq. (22.12), we can write $R_{\rm obs} = R_{\rm life} + R_{\rm BB}$ to represent the contribution of both biological
life forms and Boltzmann Brains (BB) to the set of observers in a given multiverse.
Thus, we can also write $N_{\rm obs} = N_{\rm life} + N_{\rm BB}$. We have, then, a competition between whether
most observers (or observations) are made by common observers in rare conditions (life) 
or rare observers in common conditions (BB). 

In testing a multiverse, it matters what other hypothetical observers in the multiverse 
observe, since the likelihood is normalized over $\alpha_L$. Theories must place their bets as to
what data are to be expected; for the multiverse, this means predicting what an observer 
will observe. While our calculation of the posterior involves evaluating the likelihood at 
our particular value of the constants in our universe, the normalization of the likelihood 
means that the more observers there are that do not observe what we observe, the smaller 
the likelihood. Every observer counts, not just those who observe exactly what we observe. 
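
The effect of normalizing over all observers can be illustrated with a toy one-dimensional model. In the sketch below, every functional form and number is an assumption made for illustration: biological observers are given a narrow Gaussian rate in the constant, Boltzmann Brains a small flat rate across the whole range, and the probability of each value of the constant is taken to be uniform.

```python
import math


def rate_life(alpha, peak=0.0, width=1.0):
    """Hypothetical biological observer rate: a narrow peak in alpha."""
    return math.exp(-0.5 * ((alpha - peak) / width) ** 2)


def rate_bb(alpha, level):
    """Hypothetical Boltzmann Brain rate: small but flat in alpha."""
    return level


def life_likelihood(alpha_obs, bb_level, lo=-50.0, hi=50.0, n=10_001):
    """Density of biological observations at alpha_obs, normalized over
    all observations (life plus BB), for a uniform p(alpha|TLB) on [lo, hi].
    """
    step = (hi - lo) / (n - 1)
    n_obs = 0.0
    for i in range(n):
        alpha = lo + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoidal rule
        n_obs += weight * (rate_life(alpha) + rate_bb(alpha, bb_level)) * step
    # The uniform prior density cancels between numerator and denominator.
    return rate_life(alpha_obs) / n_obs


if __name__ == "__main__":
    for bb_level in (0.0, 0.001, 0.01, 0.1):
        print(f"BB rate {bb_level:5.3f}: likelihood at alpha_obs = "
              f"{life_likelihood(0.0, bb_level):.4f}")
```

Even a Boltzmann Brain rate far below the peak biological rate lowers the likelihood at the observed value, because the flat contribution is integrated over a much larger region of parameter space.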

Figure 22.3 presents a one-dimensional illustration of this Boltzmann observer problem. 
The problem is not that we might be Boltzmann Brains, or that most entities with my mem- 
ories are fluctuation observers. We can call that the Boltzmann Me problem, and set it to 
one side. The Boltzmann observer problem is a straightforward case of a failed prediction. 
A multiverse, once the full range of observers is considered, can be strongly disconfirmed 
by the seemingly innocuous observation that I am not a brain floating in empty space. The 
problem is not that we might be Boltzmann Brains; the problem (for the theory) is that we 
are not. 

Testing the multiverse thus requires an understanding of the conditions under which 
observers can fluctuate into existence. It is of particular interest whether quantum fluc- 
tuations in a vacuum can create observers; see Boddy et al. (2014) for the case against 
such observers. In broadly thermodynamic terms, the Boltzmann observer problem seems 
formidable. Biological life requires low entropy conditions in a large region; in fact, the 
entropy of this universe seems to be far lower than is required even by biological life 
forms (Eddington, 1931; Penrose, 2004). Boltzmann Brains, on the other hand, require only 
the smallest entropy fluctuation needed to create an observer. Given the usual connection 
between low entropy conditions and improbability, this would seem to make Boltzmann 
Brains far more numerous than biological life forms. 
Figure 22.3 An illustration of the Boltzmann observer problem. The likelihood of the set of constants
that we observe given a multiverse theory, $p(\alpha_L|OTLB)$, is normalized over $\alpha_L$. In evaluating
the posterior probability of the multiverse, we evaluate the likelihood at the observed value of the
constant $\alpha_{\rm obs}$. Boltzmann Brains can exist in universes which are hostile to biological life forms, and
so can be found in a much larger region of parameter space. The larger the area under the broader
Boltzmann Brain contribution, the smaller the (renormalized) likelihood of a biological life form
observing $\alpha_{\rm obs}$.


We also face the measure problem, which in our formalism is the question of how to eval- 
uate the likelihood of a multiverse theory when the number of observers is infinite. Jaynes 
(2003, p. 486-7) warns that “attempts to apply the rules of probability theory directly 
and indiscriminately on infinite sets” lead to paradoxes, and that the only cure for this
disease is that “an infinite set should be thought of only as the limit of a specific (i.e. 
unambiguously specified) sequence of finite sets... . The mathematically generated para- 
doxes have been found only when we tried to depart from this policy by treating an infinite 
limit as something already accomplished, without regard to any limiting operation”. The 
problem for an infinite multiverse is that there is no such limit — the infinity in question 
is “completed”, an actually infinite set of universes and observers. In such circumstances, 
our probability assignments cannot be invariant under permutations of the labels on the 
observers (Olum, 2012). Infinite multiverse modellers could try to manufacture a limiting 
process — perhaps a sequence of spacetime volumes — or justify restricting attention to a 
finite subset. 


22.5.3 Typicality and the Multiverse 


Testing the multiverse has often focused on typicality: a theory is to be preferred if it 
predicts that human observers are typical in some class of objects in the universe (Hartle 
and Srednicki, 2007). For example, suppose we derive from a multiverse theory T the
distribution of observed values of some constant $\alpha$: $p(\alpha|TB)$. T predicts that, with 95 per
cent certainty, our observed value of $\alpha$ falls inside the central 95 per cent of the distribution.
If this prediction is correct, then the theory has passed this test. 

This type of reasoning is transparently frequentist: the only probabilities that we can 
define are those of data with respect to theory, so we test theories by inventing a test for the 
likelihood. Should it pass, we try to think of another test, or else get more data. It ignores
prior probabilities, and so cannot calculate the probability of a theory given the evidence. 

As with other frequentist methods, we can use Bayesian probability theory to expose
the hidden assumptions. When is typicality, defined as closeness to the likelihood peak,
a useful discriminant between models? Consider the simple case of two theories $T_1$ and $T_2$
competing to predict the value of some constant $\alpha$. We calculate the likelihood distribution
for $\alpha$ on each theory, $p(\alpha|TB)$; suppose that it is roughly Gaussian. If (a) the prior probabilities
of $T_1$ and $T_2$ are similar, and (b) the widths of the likelihood distributions $p(\alpha|T_1 B)$
and $p(\alpha|T_2 B)$ are similar, then the theory for which the observed value of $\alpha$ lies closest to
the peak of the distribution has the greater posterior probability.

Note that both conditions (a) and (b) are needed, and thus typicality is neither a necessary 
nor a sufficient condition for a multiverse theory to be a good theory. The problem with 
typicality is that it compares values of the likelihood at different values of $\alpha$, when we
should be comparing different theories by evaluating their likelihoods at the observed value
of $\alpha$.
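
A small numerical example shows why the condition on the widths matters. The sketch below assumes two hypothetical theories with Gaussian predicted distributions and equal prior probabilities; the means, widths and observed value are invented for illustration.

```python
import math


def gaussian_pdf(x, mean, sigma):
    """Normalized Gaussian density, standing in for the likelihood p(alpha|T B)."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / norm


def posterior_t1(alpha_obs, t1, t2, prior_t1=0.5):
    """Posterior probability of T1 when T1 and T2 are the only candidates.

    t1 and t2 are (mean, sigma) pairs for each theory's predicted
    distribution of alpha; all values here are purely illustrative.
    """
    joint1 = prior_t1 * gaussian_pdf(alpha_obs, *t1)
    joint2 = (1.0 - prior_t1) * gaussian_pdf(alpha_obs, *t2)
    return joint1 / (joint1 + joint2)


if __name__ == "__main__":
    alpha_obs = 1.4  # closer to the peak of T1 (at 0) than of T2 (at 3)
    equal_widths = posterior_t1(alpha_obs, (0.0, 1.0), (3.0, 1.0))
    broad_t1 = posterior_t1(alpha_obs, (0.0, 10.0), (3.0, 1.0))
    print(f"equal widths:   p(T1|alpha_obs) = {equal_widths:.3f}")
    print(f"T1 much broader: p(T1|alpha_obs) = {broad_t1:.3f}")
```

With equal widths, the theory whose peak lies nearer the observed value is favoured. When the nearer theory is much broader, it spreads its probability thinly and is disfavoured, even though the observed value is more typical on that theory.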

Let us be clear about the status of typicality. It is not an assumption to be accepted or
rejected at our leisure. It is not an assumption at all. Under certain conditions, it is a useful 
rule of thumb in evaluating competing multiverse theories. Bayesianism identifies these 
conditions. 


22.6 Extension 3: Getting Metaphysical 


In this book, George Ellis has invited us to think about not only cosmology, defined as the 
physics of the universe on large scales, but also cosmologia, which asks the great questions 
of existence, meaning and purpose that are raised by physical cosmology. Nothing in our 
formalism assumes that T is a physical theory. Indeed, if there is a final, ultimate physical 
theory of nature F, then whatever we think about that theory will have to be deeper than
physics, so to speak. Even if all that remains is to state the definition of naturalism, that 
nothing other than the physical exists, we must acknowledge that this is a statement about 
physics, not of physics. 

Further, we want to know whether or not naturalism is true. We can treat naturalism like 
any other theory, and consider its prior probability p(N|B), and the probability of the final 
scientific laws on naturalism p(F|NB). Even if we cannot calculate these quantities, they
point to the right questions to ask. Naturalism, as a hypothesis, is what statisticians call
non-informative: it gives us no reason to prefer any particular F. In the case of naturalism,
this is an in-principle ignorance, since by hypothesis there are no true facts that explain
why F rather than some other final law, why any law at all, why a mathematical law, what 
“breathes fire into the equations and makes a universe for them to describe?” (Hawking, 
1988), what is existence, and so on? 

Non-informative theories have likelihoods that are at the mercy of the size of their pos- 
sibility space. For example, “the burglar guessed the 12-digit security code” gives us no 
reason to prefer any code over any other, and thus the likelihood of any particular code 
should reflect these trillion possibilities. The only thing in our background knowledge B
that restricts the set of possible universes is internal (mathematical) consistency. Natural- 
ism, then, is at the mercy of every possible way that concrete reality could consistently be. 
This places naturalism in an unenviable position. 

Its competitors to explain F include axiarchism (Leslie, 1989) and theism (Swinburne, 
2004), which argue that we should expect the existence of physical reality with signifi- 
cant moral value, including the moral good of embodied, free, conscious moral agents. 
Axiarchism and theism, then, bet heavily on the subset of possible laws that permit the 
existence of such life forms. Whether the fine-tuning of the laws as we know them ($L_{\alpha\beta}^U$)
for life extends to final laws F, and their relative prior probabilities, will decide whether 
any of these theories is preferable to naturalism. 


Acknowledgments 


I would like to thank all the attendees of the Philosophy of Cosmology UK/US Conference, 
2014, Tenerife, for stimulating talks and discussions. This publication was made possible through the support of a grant
from the John Templeton Foundation. The opinions expressed in this publication are those 
of the author and do not necessarily reflect the views of the John Templeton Foundation. 


References 


Adams, F. C. (2008) Stars in other universes: stellar structure with different fundamental 
constants. Journal of Cosmology and Astroparticle Physics, 8, 010. 

Barnes, L. A. (2012) The Fine-Tuning of the Universe for Intelligent Life. Publications of 
the Astronomical Society of Australia, 29, 529. 

Barr, S. M. and Khan, A. (2007) Anthropic tuning of the weak scale and of mu/md in 
two-Higgs-doublet models. Physical Review D, 76, 045002. 

Boddy, K. K., Carroll, S. M. and Pollack, J. (2014) De Sitter Space Without Dynamical 
Quantum Fluctuations, arXiv:1405.0298. 

Cameron, E., and Pettitt, T. (2012) On the Evidence for Cosmic Variation of the Fine 
Structure Constant (I): A Parametric Bayesian Model Selection Analysis of the Quasar 
Dataset. arXiv: 1207.6223. 

Caticha, A. (2009) Quantifying Rational Belief. AIP Conf. Proc. 1193, 60. 

Eddington, A. S. (1931) The End of the World: from the Standpoint of Mathematical 
Physics, Nature 127, 3203. 

Einstein, A. (1949) Autobiographical Notes. In Schilpp, P. A., ed. Albert Einstein, 
Philosopher-Scientist. Illinois: Open Court Publishing Company. 

Fisher, R. A. (1921) On the ‘Probable Error’ of a Coefficient of Correlation Deduced from 
a Small Sample. Metron, 1(332). 162, 164. 

Glymour, C. (1980) Theory and Evidence. Princeton: Princeton University Press. 

Griffiths, D. (2008) Introduction to Elementary Particles. New York: John Wiley & Sons. 

Hartle, J. B. and Srednicki, M. (2007) Are we typical?, Physical Review D, 75, 123523. 

Hawking, S. W. (1988), A brief history of time. From the Big Bang to Black Holes. 
Toronto: Bantam Books. 




Jaynes, E. T. (2003) Probability Theory: The Logic of Science. Cambridge, UK: Cambridge 
University Press. 

King, J. A., Webb, J. K., Murphy, M. T., et al. (2012) Spatial variation in the fine-structure 
constant — new results from VLT/UVES. MNRAS, 422, 3370. 

Kempthorne, O. and Folks, L. (1971) Probability, Statistics, and Data Analysis. Ames, IA: 
The Iowa State University Press. 

Kolmogorov, A. N. (1933) Translated as Foundations of Probability New York: Chelsea 
Publishing Company (1950). 

Leslie, J. (1989) Universes, London, New York: Routledge. 

MacKay, D. J. C. (2003) Information Theory, Inference, and Learning Algorithms. 
Cambridge: Cambridge University Press. 

Neal, R. M. (2006) Puzzles of Anthropic Reasoning Resolved Using Full Non-indexical 
Conditioning. arXiv:math/0608592. 

Olum, K. D. (2012) Is there any coherent measure for eternal inflation?, Physical Review 
D, 86, 063509. 

Penrose, R. (2004) The Road to Reality: A Complete Guide to the Laws of the Universe. 
London: Jonathan Cape. 

Planck Collaboration, Ade, P. A. R., Aghanim, N., et al. (2015), arXiv:1502.01589. 

Swinburne, R. (2004) The Existence of God. Oxford: Oxford University Press. 

Tegmark, M. and Rees, M. J. (1998) Why Is the Cosmic Microwave Background 
Fluctuation Level 10⁻⁵? The Astrophysical Journal, 499, 526. 

Webb, J. K., King, J. A., Murphy, M. T., et al. (2011) Indications of a Spatial Variation of 
the Fine Structure Constant, Physical Review Letters, 107, 191101. 

Weinberg, S. (1993) Dreams of a Final Theory. London: Vintage. 

Whitmore, J. B. and Murphy, M. T. (2015) Impact of instrumental systematic errors on 
fine-structure constant measurements with quasar spectra. MNRAS, 447, 446. 


23 


A New Perspective on Einstein’s Philosophy 
of Cosmology 


CORMAC O’RAIFEARTAIGH 


23.1 Introduction 


It has recently been discovered that Einstein once attempted, and subsequently abandoned, 
a ‘steady-state’ model of the expanding universe (Nussbaumer, 2014a; O’Raifeartaigh, 
2014; O’Raifeartaigh et al., 2014). An unpublished manuscript on the Albert Einstein 
Online Archive (Einstein, 1931a) demonstrates that Einstein explored the possibility of 
a universe that expands but remains essentially unchanged due to a continuous forma- 
tion of matter from empty space (Figure 23.1). Several aspects of the manuscript indicate 
that it was written in the early months of 1931, during Einstein’s first trip to Califor- 
nia, and the work therefore probably represents Einstein’s first attempt at a theoretical 
model of the cosmos in the wake of emerging evidence for an expanding universe (Nuss- 
baumer, 2014a; O’Raifeartaigh et al., 2014). It appears that Einstein abandoned the 
idea when he discovered that his steady-state model led to a null solution, as described 
below. 

Many years later, steady-state models of the expanding cosmos were independently pro- 
posed by Fred Hoyle, Hermann Bondi and Thomas Gold (Bondi and Gold, 1948; Hoyle, 
1948). The hypothesis formed a well-known alternative to ‘big bang’ cosmology for many 
years (Kragh, 1996, pp. 186-218; North, 1965, pp. 208-22; Nussbaumer and Bieri, 2009, 
pp. 161-3), although it was eventually ruled out by observations such as the distribution of 
the galaxies at different epochs and the cosmic microwave background (Kragh, 1996, pp. 
318-80, 2007, pp. 201-6; Narlikar, 1988, p. 219). While it could be argued that steady-state 
cosmologies are of little practical interest now, we find it most interesting that Einstein con- 
ducted an internal debate between steady-state and evolving models of the cosmos decades 
before a similar debate engulfed the cosmological community. In particular, the episode 
offers several new insights into Einstein’s cosmology, from his view of the role of the 
cosmological constant to his attitude to the question of cosmic origins. More generally, 
Einstein’s exploration of steady-state cosmology casts new light on his philosophical jour- 
ney from a static, bounded cosmology to the dynamic, evolving universe, and is indicative 
of a pragmatic, empiricist approach to cosmology. 

Figure 23.1 An excerpt from the first page of Einstein’s steady-state manuscript (Einstein, 1931a). 
Reproduced by kind permission of the Hebrew University of Jerusalem. 


23.2 Historical Context 


Following the successful formulation of his general theory of relativity (Einstein, 1915, 
1916), Einstein lost little time in applying his new theory of gravity, space and time to the 
universe as a whole. A major motivation was the clarification of the conceptual foundations 
of general relativity, i.e. to establish ‘whether the relativity concept can be followed 
through to the finish, or whether it leads to contradictions’ (Einstein, 1917a). Assuming a 
cosmos that was homogeneous, isotropic and static over time,¹ and that a consistent theory 
of gravitation should incorporate Mach’s principle,² he found it necessary to add a new 
‘cosmological constant’ term to the field equations of relativity in order to predict a universe 
with a non-zero density of matter (Einstein, 1917b). This approach led Einstein to a 


1 No empirical evidence for a non-static universe was known to Einstein at the time. 
2 Einstein’s view of Mach’s principle in these years was that space could not have an existence independent of 
matter and thus the spatial components of the metric tensor should vanish at infinity (Einstein, 1918a; Janssen, 
2005). 

finite, static cosmos of spherical spatial geometry whose radius was directly related to the 
density of matter. 

That same year, the Dutch theorist Willem de Sitter noted that general relativity allowed 
another model of the cosmos, namely the case of a universe empty of matter (de Sitter, 
1917). Einstein was greatly perturbed by de Sitter’s solution as it suggested a spacetime 
metric that was independent of the matter it contained, in conflict with his understand- 
ing of Mach’s principle (Einstein, 1918b). The de Sitter model became a source of some 
confusion among theorists for some years; it was later realised that the model was not 
static (Lemaitre, 1925; Weyl, 1923). However, the solution attracted some attention in the 
1920s because it predicted that the radiation emitted by objects inserted as test particles 
into the ‘empty’ universe would be red-shifted, a prediction that chimed with emerging 
astronomical observations of the spiral nebulae. 

In 1922, the young Russian physicist Alexander Friedmann suggested that non- 
stationary solutions to the Einstein field equations should be considered in relativistic 
models of the cosmos (Friedmann, 1922). With a second paper in 1924, Friedmann 
explored almost all the main theoretical possibilities for the evolution of the cosmos 
and its geometry (Friedmann, 1924). However, Einstein did not welcome Friedmann’s 
time-varying models of the cosmos. His first reaction was that Friedmann had made a math- 
ematical error (Einstein, 1922). When Friedmann showed that the error lay in Einstein’s 
correction, Einstein duly retracted it (Einstein, 1923a); however, an unpublished draft of 
Einstein’s retraction makes it clear that he considered time-varying models of the cosmos 
unrealistic (Einstein, 1923b; Stachel, 1977; Nussbaumer and Bieri, 2009, pp. 91-92). 

Unaware of Friedmann’s analysis, the Belgian physicist Georges Lemaitre proposed an 
expanding model of the cosmos in 1927. A theoretician with significant training in astron- 
omy, Lemaitre was aware of V. M. Slipher’s observations of the redshifts of the spiral 
nebulae (Slipher, 1915, 1917) and of Edwin Hubble’s emerging measurements (Hubble, 
1925) of the vast distances to the nebulae (Farrell, 2005, p. 90; Kragh, 1996, p. 29). Inter- 
preting Slipher’s redshifts as a relativistic expansion of space, Lemaitre showed that a 
universe of expanding radius could be derived from Einstein’s field equations, and esti- 
mated a rate of cosmic expansion from average values of the velocities and distances of 
the nebulae from Slipher and Hubble, respectively. This work received very little atten- 
tion at first, probably because it was published in French in a little-known Belgian journal 
(Lemaitre, 1927). However, Lemaitre discussed the model directly with Einstein at the 
1927 Solvay conference, only to have it dismissed with the forthright comment: ‘Vos 
calculs sont corrects, mais votre physique est abominable’ (‘Your calculations are correct, 
but your physics is abominable’; Lemaitre, 1958). 

In 1929, Edwin Hubble published the first empirical evidence of a linear relation 
between the redshifts of the spiral nebulae (now known to be extra-galactic) and their radial 
distance (Hubble, 1929). By this stage, it had also been established that the static models of 


3 Observations of the redshifts of the spiral nebulae were published by V. M. Slipher in 1915 and 1917 
(Slipher, 1915, 1917), and became widely known when they were included in a well-known book on relativity 
(Eddington, 1923). 
Einstein and de Sitter presented problems of a theoretical nature: Einstein’s universe was 
not stable against perturbation (Eddington, 1930; Lemaitre, 1927) while de Sitter’s uni- 
verse was not static (Weyl, 1923; Lemaitre, 1925). In consequence, theorists began to take 
Lemaitre’s model seriously, and a variety of time-varying relativistic models of the cosmos 
of the Friedmann–Lemaitre type were advanced (de Sitter, 1930a, 1930b; Heckmann, 
1931, 1932; Robertson, 1932, 1933; Eddington, 1930, 1931; Tolman, 1930a, 1930b, 1931, 
1932). 

By 1931, Einstein had accepted the dynamic universe. During a three-month sojourn at 
Caltech in Pasadena in early 1931, a trip that included discussions with the astronomers 
of Mount Wilson Observatory and with the Caltech theorist Richard Tolman,⁴ Einstein 
made several public statements to the effect that he viewed Hubble’s observations as likely 
evidence for a cosmic expansion. For example, the New York Times reported Einstein as 
commenting that ‘New observations by Hubble and Humason concerning the redshift of 
light in distant nebulae makes the presumptions near that the general structure of the uni- 
verse is not static’ (Associated Press, 1931). Not long afterwards, Einstein published two 
distinct dynamic models of the cosmos, the Friedmann–Einstein model of 1931 and the 
Einstein–de Sitter model of 1932 (Einstein, 1931b; Einstein and de Sitter, 1932). 

Written in April 1931, the Friedmann–Einstein model marked the first scientific publication 
in which Einstein formally abandoned the static universe. Citing Hubble’s observations, 
he suggested that the assumption of a static universe was no longer justified (Einstein, 
1931b). Adopting Friedmann’s 1922 analysis of a universe of time-varying radius and pos- 
itive spatial curvature, Einstein then removed the cosmological constant on the grounds 
that it was both unsatisfactory (it gave an unstable solution) and unnecessary. The result- 
ing model predicted a cosmos that would undergo an expansion followed by a contraction, 
and Einstein made use of Hubble’s observations to extract estimates for the current radius 
of the universe, the mean density of matter and the timespan of the expansion (Ein- 
stein, 1931b). Noting that the latter estimate was less than the ages of the stars estimated 
from astrophysics, Einstein attributed the problem to errors introduced by the simplifying 
assumptions of the model, notably the assumption of homogeneity.⁵ 

In early 1932, Einstein and Willem de Sitter proposed an alternative model of the 
expanding universe, based on Otto Heckmann’s observation that a finite density of mat- 
ter in a non-static universe does not necessarily demand a curvature of space (Heckmann, 
1931). Mindful of a lack of empirical evidence for spatial curvature, Einstein and de Sitter 
set this parameter to zero (Einstein and de Sitter, 1932). With both the cosmological con- 
stant and spatial curvature removed, the resulting model described a cosmos of Euclidean 
geometry in which the rate of expansion h was related to the mean density of matter ρ by 
the simple relation h² = (1/3)κρ, where κ is the Einstein constant. Applying Hubble’s value 


4 An account of Einstein’s time in Pasadena can be found in Nussbaumer and Bieri (2009, pp. 144-6) and 
Bartusiak (2009, pp. 251-6). 

5 We have recently presented an analysis and first English translation of this work. We find that all of Einstein’s 
estimates contain a systematic numerical error (O’Raifeartaigh and McCann, 2014). 
of 500 km s⁻¹ Mpc⁻¹ for the recession rate of the galaxies, the authors calculated a value 
of 4 × 10⁻²⁸ g cm⁻³ for the mean density of matter, a value that they found reasonably 
compatible with estimates from astronomy (Einstein and de Sitter, 1932). 
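
As a rough check on the quoted figure, the relation between expansion rate and density can be evaluated numerically. The sketch below is ours and is not a reconstruction of Einstein and de Sitter's own arithmetic: it assumes κ = 8πG/c² with ρ a mass density and h the recession rate divided by c, and it uses modern values for G, c and the megaparsec. The function name is purely illustrative.

```python
import math

# Physical constants in SI units. These are modern standard values and are
# an assumption of this sketch; they are not taken from the 1932 paper.
G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8            # speed of light, m s^-1
MPC_IN_M = 3.0857e22   # one megaparsec in metres


def einstein_de_sitter_density(hubble_km_s_mpc: float) -> float:
    """Mean matter density in g cm^-3 implied by h^2 = (1/3) * kappa * rho."""
    hubble_si = hubble_km_s_mpc * 1.0e3 / MPC_IN_M  # recession rate in s^-1
    h = hubble_si / C                               # expansion rate as an inverse length, m^-1
    kappa = 8.0 * math.pi * G / C**2                # assumed Einstein constant, m kg^-1
    rho_si = 3.0 * h**2 / kappa                     # density in kg m^-3
    return rho_si * 1.0e-3                          # 1 kg m^-3 equals 1e-3 g cm^-3


if __name__ == "__main__":
    rho = einstein_de_sitter_density(500.0)
    print(f"{rho:.1e} g cm^-3")
```

With Hubble's value of 500 km s⁻¹ Mpc⁻¹ this gives a density between 4 × 10⁻²⁸ and 5 × 10⁻²⁸ g cm⁻³, the same order as the figure quoted by Einstein and de Sitter. Because the density scales as the square of the recession rate, a smaller value of the rate lowers the estimate sharply.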

The Einstein–de Sitter model became very well known and it played a significant role 
in the development of twentieth century cosmology (Kragh, 1996, p. 35; North, 1965, 
p. 134; Nussbaumer, 2014b; Nussbaumer and Bieri, 2009, p. 152). One reason was that 
it marked an important hypothetical case in which the expansion of the universe was pre- 
cisely balanced by a critical density of matter; a cosmos of lower mass density would be of 
hyperbolic geometry and expand at an ever-increasing rate, while a cosmos of higher mass 
density would be of spherical geometry and eventually collapse. Another reason was the 
model’s great simplicity; in the absence of any empirical evidence for spatial curvature or 
a cosmological constant, there was little reason to turn to more complicated models. How- 
ever, the timespan of the expansion was not considered in the rather terse paper. We recently 
discovered a little-known paper by Einstein containing a review of the Einstein–de Sitter 
model (Einstein, 1933a; O’Raifeartaigh et al., 2015): as in the case of the Friedmann–Einstein 
model, it is noted that the time of expansion is less than the estimated ages of the 
stars and the problem is attributed to the simplifying assumptions of the model. 


23.3 Einstein’s Steady-State Manuscript 


As pointed out in the introduction, it appears that Einstein’s steady-state manuscript was 
written in early 1931, before the Friedmann–Einstein model of April 1931. The manuscript 
(Einstein, 1931a) opens with a brief discussion of what Einstein terms the ‘cosmological 
problem’, i.e. the problem of gravitational collapse in classical and relativistic models of 
the universe: ‘It is well known that the most important fundamental difficulty that emerges 
when one asks how the stellar matter fills up space in very large dimensions is that the laws 
of gravity are not in general consistent with the hypothesis of a finite mean density of mat- 
ter. Thus, at a time when Newton’s theory of gravity was still generally accepted, Seeliger 
had already modified the Newtonian law by the introduction of a distance function that, 
for large distances r, diminishes considerably faster than 1/r²’. Noting a similar problem in 
general relativity, Einstein recalls his introduction of the cosmological constant to the field 
equations in order to allow the prediction of a universe of constant radius and non-zero 
density of matter: ‘This difficulty also arises in the general theory of relativity. However, 
I have shown in the past that this can be overcome by the introduction of the so-called 
“λ-term” to the field equations. The field equations can then be written in the form (see 
Figure 23.1) 


$(R_{ik} - \frac{1}{2} g_{ik} R) - \lambda g_{ik} = \kappa T_{ik}$ ... (23.1) 


... At that time, I showed that these equations can be satisfied by a spherical space of 
constant radius over time, in which matter has a density p that is constant over space and 
time.’ 
Einstein then notes that his static model was invalidated on both theoretical and observational 
grounds. In the first instance, the static model was unstable, while dynamic solutions 
existed: ‘On the one hand, it follows from investigations based on the same equations by 
[ ] and by Tolman⁶ that there also exist spherical solutions with a world radius P that is 
variable over time, and that my solution is not stable with respect to variations of P over 
time.’ Second, the astronomical observations of Edwin Hubble changed the playing field: 
‘On the other hand, Hubbel’s [sic] exceedingly important investigations have shown that 
the extragalactic nebulae have the following two properties: 1) Within the bounds of obser- 
vational accuracy they are uniformly distributed in space, 2) They possess a Doppler effect 
proportional to their distance.’ 

Einstein then points out that the time-varying solutions of the field equations proposed by 
de Sitter and Tolman are consistent with Hubble’s observations, but predict a timespan for 
the expansion that is problematic: ‘De Sitter and Tolman have already shown that there are 
solutions to equations (1) that can account for these observations. However the difficulty 
arose that the theory unvaryingly led to a beginning in time about 10¹⁰–10¹¹ years ago, 
which for various reasons seemed unacceptable.’ The ‘various reasons’ in the quote is 
almost certainly a reference to the fact that the estimated timespan of dynamic models was 
not larger than the ages of stars as estimated from astrophysics. However, it is possible that 
Einstein’s difficulty also concerns the very idea of a ‘beginning in time’ for the universe. 

In the second part of the manuscript, Einstein suggests an alternative solution to the 
field equations that is also compatible with Hubble’s observations — namely, an expanding 
universe in which the density of matter does not change over time: ‘In what follows, I wish 
to draw attention to a solution to equation (1) that can account for Hubbel’s [sic] facts, and 
in which the density is constant over time.’ 

Assuming a metric of flat space expanding exponentially,⁷ Einstein derives two simultaneous 
equations from the field equations, eliminating the cosmological constant to solve 
for the matter density (see Figure 23.2): 

‘Equations (1) yield: 

$\frac{9}{4}\alpha^2 + \lambda c^2 = 0$ 

$\frac{3}{4}\alpha^2 - \lambda c^2 = \kappa \rho c^2$ 

or 

$\alpha^2 = \frac{\kappa c^2}{3} \rho$’ (23.2) 


From his equation (4), Einstein concludes that the density of matter ρ remains constant 
and is related to the expansion factor α: ‘The density is therefore constant and determines 


6 The blank space representing theoreticians other than Tolman is puzzling as Einstein was unquestionably aware 
of the dynamic models of Friedmann and Lemaitre. 
7 It is easily shown that assumptions of homogeneity and isotropy imply this metric for a steady-state model. 
Figure 23.2 An excerpt from the last page of Einstein’s steady-state manuscript (Einstein, 1931a), 
reproduced by kind permission of the Hebrew University of Jerusalem. Equation (4) implies a direct 
relation between the expansion coefficient α and the mean density of matter ρ. However, the coefficient 
of α² in the first of the simultaneous equations was amended from 9/4 to −3/4 on revision, a 
correction that gives the null result ρ = 0 instead of equation (4). 


the expansion apart from its sign.’ This would be a stunning result, but it should be noted 
that equation (4) is incorrect, and arose from a numerical error in the derivation of the 
coefficient of α² in the first of the simultaneous equations. Careful study of the manuscript 
shows that Einstein later amended this coefficient from +9/4 to −3/4 (see Figure 23.2), an 
amendment that leads to the null solution ρ = 0 instead of equation (4). 
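
The effect of the amendment can be seen by writing out the elimination of the cosmological constant in both cases. The following is a sketch in our own notation, assuming that the second of the simultaneous equations reads (3/4)α² − λc² = κρc², with α the expansion coefficient, λ the cosmological constant, κ the Einstein constant and ρ the mean density of matter; adding the two equations removes λ.

```latex
\[
\begin{aligned}
\text{coefficient } +\tfrac{9}{4}:\quad
  & \tfrac{9}{4}\alpha^{2} + \lambda c^{2} = 0, \qquad
    \tfrac{3}{4}\alpha^{2} - \lambda c^{2} = \kappa\rho c^{2}
  \;\Longrightarrow\; 3\alpha^{2} = \kappa\rho c^{2}, \quad
    \alpha^{2} = \frac{\kappa c^{2}}{3}\,\rho \\[4pt]
\text{coefficient } -\tfrac{3}{4}:\quad
  & -\tfrac{3}{4}\alpha^{2} + \lambda c^{2} = 0, \qquad
    \tfrac{3}{4}\alpha^{2} - \lambda c^{2} = \kappa\rho c^{2}
  \;\Longrightarrow\; 0 = \kappa\rho c^{2}, \quad
    \rho = 0
\end{aligned}
\]
```

In the amended case the first equation fixes λc² = (3/4)α², so the left-hand side of the second equation vanishes identically and the expansion is driven by the cosmological constant alone.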

In the final paragraph of the manuscript, Einstein proposes a physical mechanism to 
allow the density of matter to remain constant in an expanding universe, namely the con- 
tinuous formation of matter from empty space: ‘If one considers a physically bounded 
volume, particles of matter will be continually leaving it. For the density to remain 
constant, new particles of matter must be continually formed within that volume from 
space.’ This proposal anticipates the later ‘creation field’ of Fred Hoyle in some ways 
(see Section 23.4). However, Einstein has not introduced a term representing the creation 
process into the field equations (unlike Hoyle). Instead, Einstein proposes that the cosmo- 
logical constant assigns energy to empty space that can be associated with the creation of 
matter: ‘The conservation law is preserved in that, by setting the λ-term, space itself is 
not empty of energy; its validity is well-known to be guaranteed by equations (1).’ Thus, 
Einstein associates the continuous formation of matter from empty space with the cosmo- 
logical constant. In reality, the lack of a specific term representing matter creation leads to 
a universe without matter in this model. It appears that Einstein recognised this problem 
on revision of the manuscript and set the model aside without pursuing the matter further. 


23.4 On Steady-State Models of the Universe 


The concept of a continuous creation of matter arose many times in twentieth century 
cosmology. In 1918, the American physicist William MacMillan proposed a continuous 
creation of matter from radiation in order to avoid a gradual ‘running down’ of the universe 
due to the conversion of matter into energy in stellar processes (MacMillan, 1918, 1925). 
The proposal was welcomed by Robert Millikan, who suggested that the process might 
be the origin of cosmic rays (Millikan, 1928). The idea of a continuous creation of matter 
from radiation was also briefly considered by Richard Tolman as a means of introducing 
matter into the empty de Sitter universe, although he found the idea improbable (Tolman, 
1929). 

Other physicists considered the possibility of a continuous creation of matter from empty 
space. In 1928, James Jeans speculated that matter was continuously created in the centre 
of the spiral nebulae (Jeans, 1928) and similar ideas of continuous creation were explored 
by Svante Arrhenius and Walther Nernst (Arrhenius, 1908; Nernst, 1928).⁸ 

Following the discovery of the systematic recession of the spiral nebulae, Richard Tol- 
man suggested that a continuous annihilation of matter into radiation might be responsible 
for an expansion of space (Tolman, 1930a). While Eddington took the view that this pro- 
cess would retard expansion (Eddington, 1930), it is possible that Tolman’s paper provided 
the inspiration for Einstein’s steady-state model. As pointed out by Harry Nussbaumer, 
Einstein had many conversations with Tolman at the relevant time and Einstein’s steady- 
state manuscript bears some mathematical similarities to Tolman’s model — if not matter 
annihilation, why not matter creation? (Nussbaumer, 2014a). 

The concept of an expanding universe that remains in a steady state due to a continu- 
ous creation of matter from empty space is most strongly associated with the Cambridge 
physicists Fred Hoyle, Hermann Bondi and Thomas Gold (Bondi and Gold, 1948; Hoyle, 
1948). In the late 1940s, these physicists became concerned with well-known problems 
associated with evolving models of the cosmos. In particular, they noted that the evolving 
models predicted a timespan for expansion that was problematic and disliked Lemaitre’s 
hypothesis of a universe with a fireworks beginning (Lemaitre, 1931a, 1931b, 1931c). 
Another concern was philosophical in nature; if the universe was truly different in the past, 
was it not inconsistent to assume that the present laws of physics applied? In order to 
circumvent these problems, the trio explored the idea of an expanding universe that does 
not evolve over time, i.e., an expanding cosmos in which the mean density of matter is 
maintained constant by a continuous creation of matter from the vacuum (Bondi and Gold, 
1948; Hoyle, 1948). 

In the case of Bondi and Gold, the proposal of a steady-state model took as a starting 
point the ‘perfect cosmological principle’, a philosophical principle that stated that the 
universe should appear essentially the same to all observers in all locations at all times. 
This principle demanded a continuous creation of matter in order to maintain a constant 


8 A review of steady-state cosmologies in the early twentieth century can be found in Kragh (1996, pp. 143-62). 
density of matter in the expanding universe. The resulting model bore some similarities 
to Einstein’s steady-state model, but it is difficult to compare the theories directly as the 
Bondi–Gold theory was not formulated in the framework of general relativity. On the 
other hand, Fred Hoyle constructed a steady-state model of the cosmos by means of a 
daring modification of the Einstein field equations (Hoyle, 1948; Mitton, 2005, pp. 118-19). 
Replacing Einstein’s cosmological constant with a new ‘creation-field’ term $C_{ik}$ to 
represent a continuous formation of matter from the vacuum, Hoyle obtained the equation 


$(R_{ik} - \frac{1}{2} g_{ik} R) - C_{ik} = \kappa T_{ik}$ 


Hoyle’s creation-field term allowed for an unchanging universe but was of importance 
only on the largest scales, in the same manner as the cosmological constant. In this model, 
the expansion of space was driven by the creation of matter, and the perfect cosmological 
principle emerged as a consequence rather than a starting assumption. A more sophisticated 
formulation of the model, based on the principle of least action, was proposed in later years 
(Hoyle and Narlikar, 1962). 

As is well known, a significant debate was waged between steady-state and evolv- 
ing models of the cosmos during the 1950s and 1960s (Kragh, 1996, pp. 252-68, 2007, 
pp. 187-90; Mitton, 2005, pp. 167-96). Eventually, the steady-state universe was effec- 
tively ruled out by observation, not least by the study of the distribution of the galaxies at 
different epochs and by the discovery of the cosmic microwave background (Kragh, 1996, 
pp. 318-80, 2007, pp. 201-6; Narlikar, 1988, pp. 218-19). There is no evidence that any 
of the steady-state theorists were aware of Einstein’s attempt; indeed, it is likely that they 
would have been greatly intrigued to learn that Einstein had once considered a steady-state 
model. 


23.5 On Einstein’s Philosophy of Cosmology 


It should come as no great surprise that, when confronted with empirical evidence for an 
expanding universe, Einstein considered a steady-state or ‘stationary’ model of the expand- 
ing cosmos. Such a model fits well with his lack of interest in non-static solutions to the 
field equations in 1917, and his hostility to the dynamic models of Friedmann and Lemaitre 
when they were first proposed (see Section 23.2). Indeed, a model of the expanding cos- 
mos in which the mean density of matter remains unchanged over time seems a natural 
successor to Einstein’s static model of 1917 from a philosophical point of view. 

However, Einstein’s attempt at a steady-state model led to a null solution, and it appears 
that he abandoned the idea rather than pursue it further. (One possibility would have 
been to introduce a matter-creation term to the field equations in the manner of Hoyle; 
another to consider a fluid of negative pressure (McCrea, 1951).) Instead, Einstein turned 
to expanding models of varying matter density that could be described ‘naturally’ by the 
field equations, i.e. without the use of the cosmological constant term (Einstein, 1931b; 
Einstein and de Sitter, 1932). It therefore seems very likely that Einstein abandoned steady- 
state cosmology on the grounds that it was more contrived than evolutionary models of the 
cosmos. 

Taken together, Einstein’s abandonment of steady-state cosmology, his removal of the 
cosmological constant term in the Friedmann–Einstein model (Einstein, 1931b), and the 
removal of spatial curvature in the Einstein–de Sitter model (Einstein and de Sitter, 1932), 
suggest a simple, pragmatic approach to cosmology. Where theorists such as Friedmann, 
Heckmann and Robertson considered all possible universes (see Section 23.2), Einstein 
sought the simplest model of the universe that could account for observation. It is worth 
asking whether this practical ‘Occam’s razor’ approach was in fact characteristic of 
Einstein’s cosmology all along, as considered below. 


23.5.1 Einstein’s Journey From the Static to the Evolving Universe 


Einstein’s journey from a static, bounded cosmology to the evolving universe is tradi- 
tionally characterised as that of a reluctant convert; a conservative Einstein, hidebound by 
philosophical prejudice until overwhelmed by irrefutable evidence (Giulini and Straumann, 
2006; Kragh, 1996, p. 26; Nussbaumer, 2014b; Nussbaumer and Bieri, 2009, p. 92; 
Smeenk, 2014). We suggest that Einstein’s steady-state manuscript provides a useful clue 
that this narrative may be somewhat inaccurate. 

Considering first Einstein’s cosmic model of 1917, it is often asserted that the cosmo- 
logical constant was introduced to the field equations in order to predict a static rather 
than a contracting universe. In fact, it is more accurate to say that the purpose of the cos- 
mological constant was to allow the prediction of a finite density of matter in a universe 
that was assumed a priori to be static. No evidence for a dynamic universe was known at 
the time, and the notion of an expanding or contracting universe would have seemed very 
far-fetched. (Indeed, Einstein refers to the model as ‘making possible a quasi-static dis- 
tribution of matter, as required by the fact of the small velocities of the stars’ (Einstein, 
1917b).) When Friedmann explored time-varying solutions of the field equations as a 
hypothetical possibility in 1922, Einstein was one of the few who paid attention; how- 
ever, he found non-static solutions ‘suspicious’ due to a lack of supporting evidence (see 
Section 23.2). In 1927, Lemaitre’s expanding model of the universe was inspired by obser- 
vations at the cutting edge of astronomical research; Einstein’s rejection of this model 
can probably be attributed to a lack of familiarity with advances in astronomy. Lemaitre 
certainly thought so, commenting later that Einstein did not seem to be aware of recent 
astronomical measurements (Lemaitre, 1958). 

With the publication of astronomical observations suggestive of an expanding cosmos 
in 1929, Einstein lost little time in abandoning the static universe. It seems that he had 
no difficulty changing his viewpoint once such a change was warranted by the evidence. 
One is reminded of a famous comment attributed to John Maynard Keynes: ‘When the
facts change, I change my mind. What do you do, Sir?’ It now seems that at this point,
Einstein’s first guess was an expanding universe that remains essentially unchanged over
time, the obvious next step after his static model. However, when this attempt led to an
empty universe, Einstein turned to evolving models instead. Noting that expanding mod- 
els did not necessarily require a cosmological constant, he removed this term (Einstein, 
1931b). When he realised that spatial curvature was also no longer a given in dynamic cos- 
mologies, this parameter was removed in turn (Einstein and de Sitter, 1932). This sequence 
of ever simpler models suggests an approach to cosmology that was not conservative 
but pragmatic — a minimalist, empirical approach to the study of the universe. Tellingly, 
Einstein did not propose any major cosmic models beyond this point; as he explained 
later, he saw little point in speculating further in the absence of empirical data on cos- 
mological parameters such as spatial curvature and the density of matter (Einstein, 1945, 
pp. 133-4). 

We note that this approach to cosmology is very typical of Einstein’s general approach 
to physics, at least in his younger years. Sometimes described as positivist, Einstein’s 
approach is more accurately described as a philosophy of logical empiricism — he embraced 
the central importance of observations in the testing of a theoretical hypothesis, at least in a 
holistic sense, but also assigned great importance to the construction of consistent theories 
from analytic principles of logic (Einstein, 1949, pp. 680-681; Frank, 1948, pp. 259-63, 
1949, pp. 271-86; Reichenbach, 1949, pp. 309-11). This is a very different approach to 
that of Comte or Mach, who suggested that the fundamental laws of physics should only
contain concepts that could be defined by direct observations, or at least be connected to 
observation by a short chain of thought. It is also different to that of empiricists such as 
Moritz Schlick or Rudolf Carnap because it contained both positivist and metaphysical
elements.⁹ An insight into Einstein’s philosophy of science in these years can be found in his
1933 Herbert Spencer Lecture at Oxford: ‘Experience remains, of course, the sole criterion 
of the physical utility of a mathematical construction. But the creative principle resides in 
mathematics’ (Einstein, 1933b, 1934, p. 36). 


23.5.2 On the Cosmological Constant and Dark Energy 


Until recently, it was universally assumed that, with the emergence of the first empirical 
evidence for an expanding universe, Einstein immediately abandoned the cosmological 
constant along with the static universe (Kragh, 1996, p. 34; Nussbaumer, 2014b; Nuss- 
baumer and Bieri, 2009, p. 147; North, 1965, p. 132; Straumann, 2002). Certainly, Einstein 
made it clear on several occasions that he disliked the term, at least from the perspec- 
tive of the general theory of relativity. (For example, in 1919 he described the term as 
‘gravely detrimental to the formal beauty of the theory’ (Einstein, 1919).) However, Ein- 
stein’s steady-state manuscript demonstrates that he retained the cosmological constant 
in at least one cosmic model he attempted after Hubble’s observations, albeit for a new 
purpose. It appears that when presented with evidence for a cosmic expansion, Einstein’s
attraction to an unchanging universe at first outweighed his dislike of the cosmological
constant, just as it did in 1917.


9 See Howard (2014) for an overview of this point.

It will not have escaped the reader’s attention that Einstein’s association of the cos- 
mological constant with an energy of space in his steady-state model is not unlike the 
current hypothesis of dark energy, at least from a philosophical standpoint. Where Einstein 
attempted to associate a continuous creation of matter with the cosmological constant, we 
now assume an energy of space responsible for an accelerated expansion.¹⁰ More generally, it has
often been noted that the cosmological constant term of 1917 anticipates the notion of dark 
energy in some ways. It is less well known that Einstein also considered — and dismissed — 
the possibility of a time-varying energy of space, a concept not unlike the modern hypoth- 
esis of quintessence. Within a few months of the publication of Einstein’s static model 
of 1917, Erwin Schrödinger suggested that the cosmological term could be placed on the
right-hand side of the field equations (a negative energy density term in the matter-energy
tensor) and that the term could be time-varying (Schrödinger, 1918). Einstein’s response
was that, if the term were constant, placing it in the matter-energy tensor was equivalent to his
original formulation. If not constant, the term would necessitate undesirable speculation
on the nature of its variation over time: ‘The course taken by Herr Schrödinger does not
appear passable to me because it leads too deeply into the thicket of hypotheses’ (Einstein,
1918c). Once again, this attitude indicates a strong dislike of complicated solutions unless
necessitated by observation.¹¹

We note that a great deal has been written over the years about Einstein’s evolving 
view of the cosmological constant. For example, the well-known Russian physicist George 
Gamow stated that Einstein once declared the term ‘my greatest blunder’ (Gamow, 1956, 
1970, p. 44), while others have cast doubt on this statement (Livio, 2013, pp. 233-41; 
Straumann, 2002). We will not enter this debate here, but simply note that Einstein soon 
dispensed with the term in his non-static cosmology. His considered view is probably best 
summed up in a footnote to his 1945 review of cosmology: ‘If Hubble’s expansion had 
been discovered at the time of the creation of the general theory of relativity, the cosmo- 
logic member would never have been introduced. It seems now so much less justified to 
introduce such a member into the field equations, since its introduction loses its sole origi- 
nal justification — that of leading to a natural solution of the cosmologic problem’ (Einstein, 
1945, p. 130). This stance should be contrasted with Einstein’s attitude to spatial curvature. 
While the Einstein–de Sitter model was based on the fact that the presence of matter in a
dynamic universe does not automatically imply spatial curvature, the authors were careful 
not to rule it out: ‘It is possible to represent the facts without assuming a curvature of three- 
dimensional space. The curvature is, however, essentially determinable, and an increase in 
the precision of the data derived from observations will enable us in the future to fix its 
sign and to determine its value’ (Einstein and de Sitter, 1932). 


10 See Peebles and Ratra (2003) for a review of dark energy. 
11 See Harvey (2012) for a fuller discussion of this episode. 


23.5.3 On the Question of Cosmic Origins


To modern eyes, a striking aspect of Einstein’s steady-state manuscript is the lack of ref- 
erence to the problem of the singularity for the case of evolving models, or to the related 
question of an origin for the universe. Indeed, the manuscript is the only steady-state model 
of the expanding universe known to us that is not motivated (at least in part) by a desire 
to circumvent such problems. While Einstein is clearly conscious of the puzzle of the 
short timespan of evolving models, there is no reference to the problem of origins (see 
Section 23.3). 

One explanation might be that Einstein’s steady-state manuscript almost certainly predated
Lemaître’s proposal of a ‘fireworks beginning’ for the universe (Lemaître, 1931b,
1931c). However, the issue of cosmic origins for evolving models was recognised before
these papers were published (de Sitter, 1932; Eddington, 1930, 1931). We note instead that 
Einstein’s silence on the question is very typical of his cosmology — there is no reference 
to the problem in either of his evolving models (Einstein, 1931b; Einstein and de Sitter, 
1932) or in a contemporaneous review of relativistic cosmology (Einstein, 1933a). In later 
years, Einstein made it clear that this silence did not stem from a philosophical difficulty 
with the notion of a physical origin for the cosmos, but from doubts concerning the validity 
of relativistic models at early epochs: ‘For large densities of field and of matter, the field 
equations and even the field variables which enter into them will have no real significance. 
One may not therefore assume the validity of the equations for very high density of field 
and of matter’ (Einstein, 1945, pp. 132-3). 


23.5.4 On Einstein’s Philosophy of Relativity 


We note in passing that Einstein’s steady-state manuscript does not contain any con- 
siderations of philosophical issues associated with the theory of relativity, as opposed 
to cosmology. Reading the opening section of the work, the professional philosopher 
may be somewhat disappointed by the lack of reference to problems such as the use of 
idealised clocks and rulers in relativity,¹² or the question of the geometrisation of gravity.¹³
This silence is once again very typical of Einstein’s cosmology; such issues are
not discussed in any of Einstein’s static or dynamic models of the cosmos, although he 
did consider them elsewhere (Einstein, 1948). This suggests once more that Einstein’s 
approach to cosmology was essentially pragmatic; general relativity was a useful tool to 
describe the universe, but by no means the ultimate answer. As we have argued elsewhere 
(O’Raifeartaigh and McCann, 2014), it is likely that Einstein’s search for a unified field the- 
ory in these years made him very conscious of the limitations of relativistic models of the 
cosmos. 


12 See Brown (2014) for a review. 
13 A longstanding question was whether the spacetime metric of relativity was a mathematical tool to describe 
gravity, or whether gravity ‘was’ geometry (Lehmkuhl, 2014). 


23.5.5 On Paradigm Shifts in Cosmology


We note finally that Einstein’s steady-state manuscript does not support a view that his 
acceptance of the evolving universe occurred as an abrupt change to a new worldview. 
As described above, the model appears as an intermediate step in a long journey from 
the static universe to an expanding, evolving cosmology. Indeed, the manuscript provides 
a new piece of evidence that today’s ‘big bang’ cosmology did not emerge as an abrupt 
‘paradigm shift’ in the manner envisioned by Thomas Kuhn (Kuhn, 1962), but rather as 
a slow dawning in both theory and observation within a single paradigm, the relativistic 
universe. 

It is unfortunate that Einstein’s cosmology papers of the 1930s are not better known, 
as the pragmatic, empirical approach we have discussed above is very different to Ein- 
stein’s work on unified field theory in these years (Einstein and Mayer, 1930, 1931, 1932). 
Indeed, we find the cosmology papers quite reminiscent of the young Einstein’s approach to 
emerging phenomena (Einstein, 1905a, 1905b, 1905c). One wonders whether the familiar 
narrative that Einstein became more and more attached to a formal mathematical approach 
to physics in his later years is entirely accurate. Could it be that Einstein’s philosophical 
approach to science did not truly change but that the intense level of mathematical abstrac- 
tion one associates with Einstein’s later work was simply a facet of the great technical 
challenge posed by unified field theory? 


23.6 Conclusions 


Einstein’s attempt at a steady-state model was abandoned before publication but it offers 
many insights into his philosophy of cosmology. His hypothesis of a universe of expanding 
radius and constant matter density is very different to his static model of 1917 or his evolv- 
ing models of 1931 and 1932, and anticipates in some ways the well-known steady-state 
cosmology of Hoyle, Bondi and Gold. The model was almost certainly written in early 
1931, when Einstein first learnt of observational evidence for a cosmic expansion, but was 
quickly abandoned when it led to a null solution. The steady-state manuscript is neverthe- 
less of interest because it offers new evidence that Einstein’s philosophical journey from 
a static, bounded cosmology to the dynamic, evolving universe was that of a pragmatic 
empiricist, rather than a reluctant conservative. 

We note finally that Einstein’s steady-state model finds an echo in current theories of 
cosmic inflation. In particular, the de Sitter metric of flat, exponentially expanding space 
used in inflationary models¹⁴ recalls the steady-state models of Einstein and Hoyle. Indeed,
many scholars have noted that inflationary models are effectively steady-state cosmologies 
over an extremely limited timespan (Barrow, 2005; Hoyle, 1994, p. 271; Narlikar, 1988, 
pp. 223-5, 2005). Furthermore, it has been suggested (Linde, 1986a, 1986b; Vilenkin, 1983)
that the inflationary process inevitably creates the conditions for further inflation in a
never-ending cycle. This concept of ‘eternal inflation’ raises the possibility that the observed,
evolving universe is a local anomaly in a global ensemble that is in a steady state (Barrow,
2005), a scenario that is not dissimilar to Hoyle’s later proposal of a steady-state universe
permeated with local ‘little bangs’ (Hoyle and Narlikar, 1966; Hoyle et al., 1993; Narlikar,
2005). Thus it can be concluded that, like the cosmological constant, the concept of the
steady-state universe is proving hard to banish from modern cosmology.


14 See Liddle (1999) for a review of inflationary cosmology.


The Nature of the Past Hypothesis


DAVID WALLACE 


There is a narrative about the nature of asymmetry in time which can be caricatured like 
this. Firstly, there is our fundamental physics — if we are being really careful about it, this is 
the standard model or our preferred post-standard-model physics; in reality, in the practical 
cases we think about, it is more likely to be ordinary quantum mechanics or maybe even 
classical Hamiltonian dynamics. In any case, it is supposed to be the physics of the micro- 
constituents of the world. And it is time-reversal invariant and shows no particular direction 
of time. 

And secondly there is the observed world, which is full of various kinds of observed 
asymmetries: dynamical asymmetries, entropic asymmetries, causal, psychological asym- 
metries, and so on. And the general way we set the problem up is as a contradiction between 
what our physics says, which is that the world is time-reversal invariant, and what we see 
around us, which is not time-reversal invariant. 

I want to suggest the advantages of a slightly more nuanced way of thinking about the 
problem. It is really not the case that all of physics, or even most of physics, or even very 
much of physics, frankly, is “fundamental” physics. In the middle — between the fundamen- 
tal physics at the bottom, and the directly observed macro-world at the top — we have a huge 
range of what we might call higher-level (or “emergent”) dynamical systems, governed by 
higher-level dynamical equations. I am thinking of the equations of fluid dynamics; I am 
thinking of the Boltzmann equation that governs dilute gases and many similar systems; I 
am thinking of the Langevin equation and the Fokker–Planck equation that govern Brownian
motion; I am thinking of the equations of radioactive decay. (In principle I am also
thinking about all the various equations and rules of the higher sciences, but we can confine 
our attention here just to the panoply of different systems and different equations that we 
study in physics.) 

Actual physics is a plethora of different dynamical systems governed by different sorts 
of laws. And if you look at those higher-level laws you find — not universally, but very 
frequently — that they have a whole range of properties which are not shared by the 
fundamental physics. I want to focus on two particular properties like this. 

Firstly, they tend to have a lack of time symmetry. It is worth pausing on what that
means for a second, because in one sense the standard model (and its plausible successors) 
do not have time symmetry either: they are symmetric under CPT but not under T alone. 
But higher-level theories are time-reversal-non-invariant in a more important way that the 
standard model does not share: they are generally irreversible, which is to say that their 
dynamical maps are many-to-one, either in the literal sense that they take different initial 
states to the same later state or in the sense that they take different initial states to final states 
that get closer and closer together over time, so that to any given grain of resolution they 
might as well be taken as the same final state. And frequently these higher-level theories 
are not just irreversible in general, but have attractors: particular points in their state space 
such that all the states which share the same conserved quantities as that attractor end up 
at that attractor, according to the equations of those theories. 
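
Both senses of irreversibility, and the idea of an attractor, can be seen in a toy model. The sketch below is purely illustrative and is not one of the physical equations mentioned above: the function names `smooth` and `evolve`, the six-cell ring, and the particular averaging rule are all assumptions made for the example. Each step replaces the value in a cell by a weighted average of itself and its two neighbours, which conserves the total while erasing differences between states.

```python
def smooth(state):
    """One step of a hypothetical averaging dynamics on a ring of cells.

    Each cell becomes (left + 2 * self + right) / 4, so the total is conserved.
    """
    n = len(state)
    return [(state[i - 1] + 2 * state[i] + state[(i + 1) % n]) / 4
            for i in range(n)]


def evolve(state, steps):
    """Apply the averaging step repeatedly."""
    for _ in range(steps):
        state = smooth(state)
    return state


# Literal many-to-one behaviour: two different states share one successor.
a = [1, 0, 1, 0, 1, 0]
b = [0, 1, 0, 1, 0, 1]
print(smooth(a) == smooth(b))

# Attractor behaviour: a very uneven state with the same total (3) ends up
# indistinguishable, at any practical resolution, from the uniform state.
print(evolve([3, 0, 0, 0, 0, 0], 200))
```

In this toy model the uniform state plays the role of the attractor for a given value of the conserved total, and nothing in the later state tells us which of the many possible earlier states it came from.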

And secondly — somewhat less importantly for my purposes, but not irrelevantly — the 
dynamical equations of these higher-level theories tend to be probabilistic. Which is to say: 
sometimes they are stochastic equations; sometimes they are equations for the evolution 
of classical probability distributions; quantum-mechanically they tend to be equations for 
the evolution of mixed states. In general we tend to recover determinism for these kinds 
of theories only in a law-of-large-numbers sense, and only when we are talking about 
systems with a lot of degrees of freedom. Something like the Boltzmann equation will do 
as an example; characteristically [1] that is set out probabilistically as an evolution equation 
for a one-particle marginal (whether that is a density operator, as in quantum theory, or a 
classical probability distribution), but the multi-body correlations are weak enough and the 
particle numbers are large enough that at the end of the process we can treat the predictions 
as deterministic. That is not always how it goes — Boltzmann famously derived the classical 
Boltzmann equation directly without going through a probability route [2] — but it tends to 
be the general pattern. 
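
The law-of-large-numbers point can be made concrete with radioactive decay, one of the examples mentioned earlier. The following is a minimal sketch under stated assumptions: the function name `surviving_fraction`, the per-step survival probability of 0.9, and the choice of ten steps are all hypothetical. Each atom independently survives each time step with a fixed probability, so the rule for a single atom is stochastic, yet the surviving fraction of a large sample is, for practical purposes, fixed.

```python
import random


def surviving_fraction(n_atoms, survive_prob, n_steps, rng):
    """Simulate independent decays and return the fraction of atoms left."""
    alive = n_atoms
    for _step in range(n_steps):
        # Every atom still alive survives this step with probability survive_prob.
        alive = sum(1 for _atom in range(alive) if rng.random() < survive_prob)
    return alive / n_atoms


rng = random.Random(0)        # fixed seed so that runs are repeatable
expected = 0.9 ** 10          # deterministic prediction for the fraction left
for n_atoms in (10, 1_000, 100_000):
    fraction = surviving_fraction(n_atoms, 0.9, 10, rng)
    print(n_atoms, round(fraction, 4), round(expected, 4))
```

For a handful of atoms the simulated fraction can differ noticeably from the deterministic prediction; for a large sample the statistical scatter shrinks roughly as one over the square root of the number of atoms, which is the sense in which determinism is recovered.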

Pause for a second — and set aside for a second how any of this higher-level physics 
is linked to the underlying fundamental physics — and ask just how we would go about 
making claims about the past and the future of a system governed by these kinds of laws. 
In a theory governed by reversible dynamical laws, there is going to be a fairly obvious 
symmetry in how we do it. How do you use the theory to learn about the future? You look 
at what the present state is, turn the handle of the dynamics, and out comes a prediction 
about the future state that can be checked against experiment. How do you use the theory 
to learn about the past? Pretty much the same. You plug in the present state, you run the 
dynamical equations backwards, that tells you what the past state is supposed to be, and 
then you compare it with what it actually was. And there is a reason we call this by the 
neologism “retrodiction”: it is to suggest that we are doing the same kind of thing as a
prediction about the future. 

In a theory governed by irreversible dynamical laws, prediction is going to be a similar 
kind of game: you plug in the present state of the world, evolve it forward in time, and out 
comes a statement — possibly probabilistic — about the future state of the world. We are 
not in a position to retrodict in the same way, because in an irreversible theory, the present 
state is typically not going to determine a past state. That could be because the dynamics 
is deterministic but irreversible, so that many past states are compatible with the present
state; it could also be because the dynamics is probabilistic and just is not really in the 
business of telling us anything about what the past looked like. 

What we do in practice is a kind of “guess and check”. At the crudest level, we take a 
guess as to what looks like a reasonable past state, we evolve it forward and see what the 
state would be now, we compare with what it actually is, and we iterate that process. If 
we want to be slightly more careful, more systematic, more formal about that, we could 
put something like a Bayesian prior distribution over the candidate initial states, use that
Bayesian prior to work out a probability distribution over possible present states, condi- 
tionalize on the actual present state and see where that leads us. However this process is 
made precise, let us call it historical inference, and distinguish it sharply from retrodiction. 
It is our normal means by which we learn about the past in these kinds of theories. 
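
The more formal version of “guess and check” can be sketched in a few lines. Everything specific here is an assumption made for illustration: the eight-state system, the halving rule used as a deterministic but irreversible dynamics, the uniform prior, and the names `evolve` and `historical_inference`. The procedure never runs the dynamics backwards; it only evolves candidate past states forwards and keeps those compatible with the present.

```python
from fractions import Fraction


def evolve(state, steps):
    """Hypothetical irreversible dynamics: integer halving is many-to-one."""
    for _ in range(steps):
        state = state // 2
    return state


def historical_inference(prior, observed_now, steps):
    """Conditionalize a prior over past states on the observed present state."""
    # Keep only the candidate past states that evolve into what we see now.
    weights = {s: p for s, p in prior.items()
               if evolve(s, steps) == observed_now}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no candidate past state leads to the observed state")
    return {s: p / total for s, p in weights.items()}


prior = {s: Fraction(1, 8) for s in range(8)}   # uniform prior, an assumption
posterior = historical_inference(prior, observed_now=1, steps=2)
print(posterior)
```

With a probabilistic dynamics the hard compatibility filter would be replaced by a likelihood, the probability of reaching the observed present state from each candidate past state, but the structure of the inference is the same.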

It is worth reminding ourselves of the cases where we actually use retrodiction, just to 
see the contrast with historical inference. If we want to work out the dates of some eclipse 
that is mentioned in some fifth century BC history we really do carry out pretty much 
the time-reverse of the calculations that we use to work out the dates of an eclipse in the 
next century — that is, we really do just run the equations of the solar system backwards. 
But this is very much the exception which proves the rule, and even then it only applies 
approximately. 

In a world which was, hypothetically, really governed by these kinds of irreversible (and
often probabilistic) laws, it would not be particularly mysterious that we had a whole 
bunch of psychological asymmetries, causal asymmetries, entropic asymmetries, and so 
forth. If our underlying physics contained a whole bunch of irreversibility and violation of 
time-reversal invariance, we would expect that the observed world would also have those 
features. 

All this suggests a different way of setting up the problem of asymmetry in time from 
the way it is normally set up, the way we started with. In that approach, typically we say: 
how do we reconcile the fundamental-physics level directly with the observed world? My 
suggestion, as a friendly amendment to this way of thinking, is that there are advantages 
in keeping the discussion internal to the equations of physics, and asking: how can we 
reconcile the bottom level, the fundamental physics level, with the emergent higher levels, 
the levels of irreversible dynamical equations? If we can sort this out, the remaining step — 
reconciling that higher level with the observed-macro-world asymmetries — looks rather 
tractable. 

Why do I make this suggestion? There are three reasons. The first is just division of 
labour. It is quite a job to say: ok, how am I to reconcile the microscopic physics of the 
world with the fact that I remember the past but not the future? There are so many different
layers, and so many different bits of science, filtering into that kind of story that trying 
to do the whole thing all in one go inevitably involves radically simplifying models, and 
dynamical assumptions that we are not remotely in a position to check. On the other hand, 
if we can outsource the problem of understanding why we remember the past but not 
the future to those with relevant expertise in memory and cognition and evolution and so 
forth while handing them on a plate a whole bunch of underlying physics that has time 
irreversibility in it already, then the task looks a bit more tractable than trying to do the
whole thing ourselves. 

The second reason is that we need to understand the fundamental/emergent relation any- 
way. An account of why we remember the past and not the future, or an account of why 
ice melts in qualitative terms, that does not also tell us quantitatively how it is that we 
can have the Langevin equation or something similar holding compatibly with our
microphysics has not finished the job, because that equation is demonstrably correct for some
physical systems and we need to understand why. Conversely, if we can account for the 
latter, if we can get a grip on how to reconcile higher-level irreversible dynamical systems 
with bottom-level reversible dynamical systems, we shall get a lot of the rest more or less 
for free. 

The third reason, and the one I want to dwell on, is: there is something funny about 
saying it is a deep mystery how we can reconcile time-reversible microphysics with
time-irreversible higher-level physics. In one sense it is a mystery: we can prove a whole bunch
of formal incompatibilities and show that no time-asymmetric physics can be derived from 
a time-symmetric starting point. But on the other hand, mostly we do not get our higher- 
level dynamical physics purely from phenomenology, purely from experiment. To a very 
large extent we actually do derive it from the microphysics — or perhaps, to avoid beg- 
ging the question, we at least construct it from the microphysics. We have a collection 
of thoroughly used and highly successful techniques for starting with micro-level physics 
and getting out equations governing the higher-level physics. In fact the great bulk of what 
we call the evidence for our low-level physics is actually mediated through this kind of 
process. And we do not just get the qualitative form of the higher-level equations out, we 
actually get the coefficients. 

For instance, think about the decay rates of particles. Those are governed by a time- 
irreversible decay equation, but we get that equation out from quantum field theory. And 
we get out the decay coefficient while doing it. So at some level we clearly know how to do 
this, or at least we have a trick that very reliably works, and which works, as philosophers 
would say, projectably, in two senses. It allows us to work out dynamical equations which 
seem to be laws for the systems in question, in that they apply to those systems wherever 
they are in space and time. And the fact that we can do this also turns out to be projectable: 
if we take a novel physical system where we have not yet tried to work out what the macro- 
dynamics are, and if we use these kinds of techniques, we tend to get out empirically correct 
laws. 
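
For concreteness, the decay equation in question has the familiar exponential form. In the sketch below the symbols are assumptions chosen for illustration: $N(t)$ is the expected number of undecayed particles at time $t$, and $\Gamma$ is the decay coefficient.

```latex
\frac{dN}{dt} = -\Gamma N,
\qquad
N(t) = N(0)\, e^{-\Gamma t},
\qquad
\Gamma > 0 .
```

Under the substitution $t \to -t$ the equation becomes $dN/dt = +\Gamma N$, which describes exponential growth rather than decay, so the equation is not time-reversal invariant. Moreover, the difference between any two solutions shrinks as $e^{-\Gamma t}$, so distinct initial states are driven ever closer together, which is irreversibility in the second of the two senses distinguished earlier.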

So that suggests that if we want to understand where the time asymmetry comes into 
physics, we ought to be looking at what ingredient we are actually putting in when we 
construct the Langevin equation, or the radioactive decay equation, or any of these higher- 
level equations. And indeed, this is a sanity check on extant claims of where the asymmetry 
is. If, for instance, it is claimed that the origin of temporal asymmetry is a specific low 
entropy boundary condition for the Universe, we ought to be able to see how it is that this 
low entropy boundary condition, perhaps indirectly, underpins whatever is actually being
put in to our derivational process to get out the Langevin equation and the like from our
microphysics. 

And that moves me on to the second half of what I want to talk about: let us actually 
have a look in a little bit more detail at these derivations, and see what is going on. For 
technical details here, see Wallace [3], chapter 9, and references therein.

I am going to be very qualitative, and as general as I can manage, but of course the activity
of deriving higher-level dynamical equations from lower-level dynamical equations is
enormously wide and varied — indeed, in a sense it is the great bulk of physics — and I only 
know small corners of it. So let us stipulate: I am talking about a certain subclass of such 
processes. That subclass is not empty; conjecturally I would say that that subclass contains 
an awful lot of what we do, but it is not particularly my brief to say that it covers every 
low-level/high-level derivation known. 

Here is a generic model of how I am going to think about things. To a large extent what 
we are doing when we construct higher-level physics is some kind of coarse-graining. At 
the kinematic level, there is a state space S_L of the low-level theory, and there is a state
space S_H of the high-level theory, and there is what we might call a reduction map that
takes us from points of S_L to points of S_H. That map is typically going to be many-to-one:
it is going to take a whole group of low-level states and associate them with a single
high-level state. 

The paradigmatic example of this is something like the way we do fluid or gas dynamics. 
Here the low-level theory, at least classically, is going to be the Hamiltonian dynamics of 
10²³ point particles interacting under some force laws, and the states in the high-level
theory are going to be specified by giving the pressure, density and velocity of the gas
averaged over, say, one-cubic-micron cells. This means that S_H is still a high-dimensional
state space, but it is a lot smaller than the low-level space S_L: a whole bunch of different
particle arrangements are going to be associated to single fluid states. And it is going to
follow, of course, just from that reduction map, that if I take a trajectory in S_L, defined by
the low-level dynamics, that trajectory is going to be mapped to some path in S_H.
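
A drastically simplified version of this reduction map can be written down directly. The sketch below makes several assumptions for the sake of illustration: the gas is one-dimensional, only the particle count and mean velocity per cell are kept (pressure is omitted), positions are taken to lie within the cells, and the function name `coarse_grain` is hypothetical.

```python
def coarse_grain(particles, cell_width, n_cells):
    """Map a microstate, a list of (position, velocity) pairs, to a macrostate.

    The macrostate is a list with one (count, mean velocity) pair per cell.
    Positions are assumed to satisfy 0 <= position < n_cells * cell_width.
    """
    counts = [0] * n_cells
    momentum = [0.0] * n_cells
    for position, velocity in particles:
        cell = int(position // cell_width)
        counts[cell] += 1
        momentum[cell] += velocity      # unit masses assumed
    return [(n, p / n if n else 0.0) for n, p in zip(counts, momentum)]


# Two different particle arrangements...
micro_a = [(0.1, 1.0), (0.4, 3.0), (1.2, -2.0)]
micro_b = [(0.3, 2.5), (0.9, 1.5), (1.7, -2.0)]

# ...are associated with one and the same coarse-grained state.
print(coarse_grain(micro_a, cell_width=1.0, n_cells=2))
print(coarse_grain(micro_a, 1.0, 2) == coarse_grain(micro_b, 1.0, 2))
```

Applying `coarse_grain` to every point along a low-level trajectory yields the corresponding path in the high-level state space; whether that path obeys any autonomous law of its own is a further question.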

If we move outside the particular case I have just discussed and ask generally how we 
might try to set up high-level to low-level correspondences in physics, there is a slightly 
unfortunate tendency at this point to go into pragmatics and epistemology. Sometimes you 
will hear people say that what we are doing when we go from the low-level to the high- 
level theory is: we are keeping only those features of the system that we are interested in 
and discarding those that do not interest us. For instance, maybe we are not interested in 
the precise positions of the gas: we just want to know its bulk state. 

That is not going to work. I am not actually very interested in the cubic micron of gas 
over there in that corner; I am not really very interested in the gas in this room at all. I could 
perfectly happily go through the rest of my life knowing nothing about it, and I imagine so 
could the rest of you. But it is still true that its dynamics is governed by the equations of 
fluid dynamics; that generality is no less true because I do not care about it. 

Conversely, I am actually quite interested in what the stock market looks like tomorrow. I 
would love to be able to do a dynamical reduction where I coarse-grained over everything
except the leading numbers in the price index tomorrow and kept those. But, however
fascinated I might be about that, I cannot do it, not in a way that is predictively useful. What 
is going on here is that we are not really asking: when is the high-level theory defined by 
the degrees of freedom we are interested in? We are actually asking: when is the high-level 
theory defined by degrees of freedom for which we can write down autonomous dynamical 
rules? 

What do I mean by that? The low-level theory has dynamics, which determine low-level 
trajectories, and each low-level trajectory determines a high-level trajectory, but there is no 
a priori guarantee that the low-level dynamics determines a high-level dynamics. There is 
no guarantee, for instance, that there cannot be two trajectories in S_L whose images in S_H 
are identical up until some moment of time and then diverge; which is to say that there is no 
guarantee that in moving from the low level to the high level I have not discarded some 
information which is actually necessary to predict the future evolution of the high-level 
states. 

So what we actually want are reduction maps that do not have this feature: that actually 
do generate a high-level dynamics from a low-level dynamics. In this situation we have 
something we can call a dynamical reduction process. Another way to put it is that we can 
find a dynamics on S_H such that evolving a state in S_L forward in time under the lower-level 
dynamics and then mapping it to S_H, or mapping it to S_H immediately and evolving it 
under the dynamics on S_H, gives the same results. In mathematical terminology, dynamical 
evolution and the coarse-graining reduction map commute. 
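
Written out, with notation that is assumed here for illustration rather than taken from the talk (S_L and S_H for the low-level and high-level state spaces, π for the reduction map from S_L to S_H, and U_L(t), U_H(t) for low-level and high-level evolution over a time t), the commutation condition reads:

```latex
\[
  \pi\bigl(U_L(t)\,x\bigr) \;=\; U_H(t)\,\pi(x)
  \qquad \text{for every } x \in S_L \text{ and every } t \ge 0 .
\]
```

The left-hand side is “evolve, then coarse-grain”; the right-hand side is “coarse-grain, then evolve”.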

There does not need to be any such high-level dynamics, but often there is. And in 
particular, often that high-level dynamics is pretty robust against the fine-grained details of 
how we define the reduction map. (I talked about my gas on cubic micron cells, but clearly 
I could have chosen ten cubic microns or half a cubic micron.) 

The existence of these high-level dynamical laws is, or ought to be, kind of surprising. 
After all, step back from the fact that we know it works empirically, and ask how is it that 
we can find equations for just a small number of degrees of freedom in a huge system that 
are autonomous, that can be studied dynamically while discarding the information at the 
other degrees of freedom? As with the stock market case, it is not generally true that that 
can be done: generally we cannot just pick a subset of degrees of freedom and work out 
what they do, given that they are dynamically coupled to all the other degrees of freedom. 

So why do high-level laws ever exist? Very broadly, we can see two reasons. Firstly, 
sometimes it happens because there really is a dynamical decoupling of some degrees of 
freedom from others, and generally that happens when we have got a symmetry of some 
kind in play. So it really is the case that, under appropriate approximations, the centre 
of mass degree of freedom of the Earth decouples from the zillions of other degrees of 
freedom of the Earth. And so I really can write down an evolution equation for the centre 
of mass of the Earth treated as a point with just three degrees of freedom. Even that is 
not perfect: if the gravitational field varies quickly on scales comparable to the length 
of the Earth, I am going to have trouble, but to a first approximation I can do it. And I 
can do it basically because the symmetry structure of the dynamics lets me decouple the 
centre-of-mass degrees of freedom from the rest. 
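
The centre-of-mass case can be checked numerically. The sketch below is a toy model, not anything taken from the talk: the particle count, the spring-like internal forces, the two external fields and all function names are assumptions chosen for illustration. It compares “evolve all the particles, then take the centre of mass” with “take the centre of mass, then evolve it as a single point”.

```python
import numpy as np

def uniform_field(x):
    """Same acceleration everywhere (an idealised uniform gravitational field)."""
    return np.broadcast_to(np.array([0.0, 0.0, -9.8]), x.shape)

def cubic_field(x):
    """A strongly position-dependent field, chosen only to break the decoupling."""
    return -x**3

def low_level_step(pos, vel, masses, field, dt, k=5.0):
    """Advance all N particles by one semi-implicit Euler step.

    Internal forces are linear springs between every pair of particles,
    so they sum to zero over the whole body.
    """
    internal = k * (pos.sum(axis=0) - len(pos) * pos)  # force on i: k * sum_j (x_j - x_i)
    vel = vel + dt * (internal / masses[:, None] + field(pos))
    pos = pos + dt * vel
    return pos, vel

def high_level_step(X, V, field, dt):
    """Advance a single point particle (the would-be centre of mass)."""
    V = V + dt * field(X)
    X = X + dt * V
    return X, V

def reduction_map(pos, vel, masses):
    """Coarse-grain 6N numbers down to centre-of-mass position and velocity."""
    w = masses[:, None] / masses.sum()
    return (w * pos).sum(axis=0), (w * vel).sum(axis=0)

def commutation_error(field, n=50, steps=1000, dt=1e-3, seed=0):
    """Distance between 'evolve then reduce' and 'reduce then evolve'."""
    rng = np.random.default_rng(seed)
    pos, vel = rng.normal(size=(n, 3)), rng.normal(size=(n, 3))
    masses = rng.uniform(0.5, 1.5, size=n)
    X, V = reduction_map(pos, vel, masses)
    for _ in range(steps):
        pos, vel = low_level_step(pos, vel, masses, field, dt)
        X, V = high_level_step(X, V, field, dt)
    X_low, _ = reduction_map(pos, vel, masses)
    return float(np.linalg.norm(X_low - X))
```

In this toy model `commutation_error(uniform_field)` should vanish up to rounding, because the internal forces cancel and every particle feels the same external acceleration, while `commutation_error(cubic_field)` should not, which mirrors the caveat about fields that vary on the scale of the body.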

Much more commonly, as for instance in the gas, what is going on is not that there is 
complete decoupling of this kind: it is that the residual degrees of freedom, those discarded 
when we apply a coarse-graining reduction map, are very large in number and very random 
in the fine details of their dynamics, and each one of them is contributing only to a very 
small extent to the overall dynamics. So we can do a statistical trick: rather than keep track 
of them in bulk we can just keep track of their statistical averages; we can sum all their 
contributions up and treat the whole thing as a sort of generalized noise term. (And you 
see this in some of the ways one actually derives these kinds of equations in detail.) Using 
that method is implicitly taking a bet that actually those fine-grain details do not matter 
and that we really can treat them as a sort of averaged-over noise. 

That sounds great, but there is a sense in which we know that it cannot really be true. 
And the sense is the following: if it really were the case that some reduction map from low- 
level to high-level state space was compatible with a completely accurate, robust reduced 
dynamics for the high-level theory, then if the dynamics of the low-level theory is time- 
reversal invariant the dynamics of the high-level theory had better be time-reversal invariant 
as well. And it is not. So something went wrong. 

And if we look at what went wrong — if we look at the mathematics of what we are doing 
here — what is going wrong is that it is not true that any distribution of the microscopic 
degrees of freedom with such and such average will behave in such and such a way. There 
will be ways of tuning and setting up the microscopic degrees of freedom, so that they are 
aligned in just the right way to break our assumptions about how the dynamics is going to 
work. 

Let me give an example here — partly to help see how general this discussion is; it is 
not a statistical mechanical example in the usual sense. Think about radioactive decay. We 
have simple higher-level dynamics for decay: the probability of an unstable particle not 
decaying is exponential in time, and there are also terms for particles being kicked into 
undecayed states by absorbing the sort of particles that comprise the decay products. 
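
As a hedged illustration of the first half of that statement, and ignoring the reabsorption terms, the survival law can be written with an assumed decay constant λ (a symbol introduced here, not used in the talk):

```latex
\[
  P_{\text{undecayed}}(t) \;=\; e^{-\lambda t}, \qquad \lambda > 0,\ t \ge 0 .
\]
```

This law is not invariant under t → −t: its mirror image, e^{+λt}, describes growth rather than decay.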

Now ask how all this works quantum-mechanically. If I take a particle on my desk which 
has in fact not decayed and evolve it forward in time, it will evolve into a superposition 
of the undecayed particle and a whole bunch of decay products at different times. If I try 
evolving it backwards under Schrödinger’s equation, I will erroneously predict the time 
reverse of decay: after all, the equation has a time-reversal symmetry. How do I reconcile 
that with the fact that the particle has just been sitting on my desk undecayed? 

The answer is that the full quantum state at the present time contains not just a term 
describing the undecayed particle but all the terms corresponding to the particle hav- 
ing decayed at various times in the past. And properly running the dynamics backwards 
means allowing for all of these other terms — which all have just the right mod-squared- 
amplitudes, and just the right phases, that they continually cancel out the decay terms that 
are produced from the backward evolution of the undecayed term. It is because of that 
set-up being aligned just right (in Everettian terms [3, 4], it is because of all the branches 
going backwards in time interfering in just the right way) that I get the wrong answer if I 
try to retrodict using the normal radioactive decay equations. 

And for the same reason, if I take the time reverse of the current state (including all the 
decay terms as well as the undecayed term corresponding to our current experience), that 
state’s forward time evolution will not match the normal prediction of radioactive decay. 
More or less any old generic state of the fields would not do this, but 
this very carefully prepared one, prepared by time evolving the system forward and then 
time-reversing, is set up in just the right way to do it. 

This state is a counter-example state, a state for which our methods of statistical aver- 
aging fail to generate the right higher-level evolution. And in general, there will exist 
counter-example states whenever we try to derive irreversible higher-level dynamics from 
reversible lower-level dynamics. 
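
A standard toy model that makes this concrete is the Kac ring. It is brought in here purely as an illustration and is not a model discussed in the talk; the ring size, marking fraction and function names below are assumptions. Balls coloured +1 or −1 sit on a ring and all move one site per time step, changing colour whenever they cross a marked edge. The micro-dynamics is exactly reversible, and time reversal corresponds to reversing the sense of rotation. The usual averaging argument nonetheless gives an irreversible macro-law for the colour imbalance, and the “evolve forward, then time-reverse” recipe produces a state that violates it.

```python
import numpy as np

def step(colours, flips, direction):
    """Move every ball one site round the ring.

    flips[i] is -1 if the edge between sites i and i+1 is marked, else +1.
    direction=+1 moves balls from i to i+1; direction=-1 is its exact inverse.
    """
    if direction == +1:
        return np.roll(colours * flips, 1)
    return np.roll(colours, -1) * flips

def evolve(colours, flips, direction, t):
    """Apply `step` t times."""
    for _ in range(t):
        colours = step(colours, flips, direction)
    return colours

def macro_prediction(delta0, mu, t):
    """Irreversible high-level law obtained by averaging: geometric decay."""
    return delta0 * (1.0 - 2.0 * mu) ** t

rng = np.random.default_rng(1)
N, MU, T = 100_000, 0.05, 20
flips = np.where(rng.random(N) < MU, -1, 1)   # randomly marked edges

simple = np.ones(N, dtype=int)                # simple initial state: all balls +1
later = evolve(simple, flips, +1, T)          # state after T steps
delta_later = later.mean()                    # colour imbalance, tracks the macro-law

# Counter-example state: the colours of `later` with the rotation reversed.
returned = evolve(later, flips, -1, T)
delta_returned = returned.mean()              # macro-law predicts further decay
```

For the simple initial state, `delta_later` should sit close to `macro_prediction(1.0, MU, T)`. For the reversed state the macro-law would predict a still smaller imbalance, `macro_prediction(delta_later, MU, T)`, whereas the exact dynamics carries every ball back to +1.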

What can we say in general about the counter-example states? Just that they are going 
to be very delicately and carefully structured. It is tempting to say that they are very low 
probability states, in some sense, but I want to warn against that, because we are mostly 
here talking about theories that are already formulated as probability theories or theories 
of mixed states, so our space of states here is a space of probability distributions or mixed 
states anyway. And talk of “low-probability probability distributions” is at least prima facie 
not well defined. So it is not that the counter-example states are low probability, exactly, 
but that heuristically, and in some cases demonstrably, they are going to be very delicate, 
complicated, carefully specified states. (Indeed, the only way we really know to write down 
any such states is the one we used in the case of radioactive decay: take a simple state, evolve it 
forwards through time, and then time-reverse it.) 

So then the question of why high-level dynamics of this kind in general works (and in 
particular, the question of why non-equilibrium statistical mechanics works) is going to be 
the question of why we are allowed to assume that the initial state is not like that, is not one 
of the counter-example states. And of course if we take any given physical system (some 
small, mundane system, like the glass of water on my desk), it is not surprising that its 
initial state is not like that: its initial state has been prepared by a whole bunch of other 
dynamical processes, so unless those dynamical processes are really, really carefully set 
up in a certain way, or unless the initial state that we feed into those processes was very 
carefully set up in a certain way, we would be very surprised to find that the initial state 
of our mundane system was one of the very delicately assembled states which count as the 
exceptions to the usual reduction rule. 

What we have done there is push the requirement that the state is not careful and 
special in this sense back to an earlier system: the system which led dynamically to our 
mundane system having the initial state it had. And of course (and this is my justification for 
talking about this material at a conference on cosmology) if you keep pushing these things 
back and back, following from our initial system, to the system that generated it, to the 
system that generated that system, ... where you are going to get to is a requirement that 
the initial state of the Universe as a whole is not one of these very special counter-example 
states. 

To sum up: what we seem to get out if we look at the actual structure of reduction and 
emergence in physics is a claim about the initial state of the universe (something that 
is hardly novel in discussions of time asymmetry), but a claim that is slightly different 
from the usual claims. Here I have in mind the neo-Boltzmannian approach espoused by, 
for example, Albert [5], Penrose [6], Carroll [7] or Goldstein [8]. We are not requiring 
anything of the macro-state of the early universe: in particular, we are not requiring it to 
have low entropy. Instead, we are requiring something about its micro structure: we are 
requiring it not to be set up in the kind of delicately correlated ways that invalidate our 
general averaging moves. 

From this perspective, far from introducing a condition which makes the initial universe 
very special (the usual way “past-hypothesis” claims about the early Universe are 
phrased), we are asking that it should not be very special. We need to be a bit careful, 
though, because of course if we suppose that the universe is closed and has a final state as 
well as an initial state, and if we take this “not very special” initial state and run it forward, 
then, precisely because this licenses us to apply higher-level, emergent dynamics, the 
final state is going to be really, really special. It is going to have built into it just the right 
correlations such that if time-reversed and run backwards it will not be governed by the 
macro-level dynamics, but rather by their time reverse. So we have not somehow dissolved 
away the need for an asymmetric assumption. But that assumption looks a bit different, 
conceptually and mathematically, from what we are used to in these discussions: we are 
not supposing that the state at one end of time is particularly low-entropy, but that it is 
relatively simple, relatively free of complicated and delicate correlative structure. 

What should we say about the fact that the early universe has a very low entropy? Well, 
it certainly seems to need explanation, but it does not seem to be very different from other 
facts about the initial condition of the universe. How do we learn about the fact that the 
early universe has low entropy? By historical inference: we take our possible guesses as 
to what the initial state is, we evolve them forward and we compare with what we have. 
And since our macro-level dynamical theories are entropy-increasing, given that they are 
correct, the fact that the early universe has a low entropy compared to the present day 
universe is a claim that is not very difficult to get out historically. 

So: my suggested way of proceeding gives us, in David Albert’s terms [5], a different 
sort of transcendental assumption that we need to make in order that our physics works. 
But it is not a transcendental assumption of the kind we can actually empirically check in 
cosmology; it is a transcendental assumption about the fine-grained micro-level delicate 
structure of the early universe. And it is basically the assumption that it has not got much 
of it. 


24.1 Discussion 


David Albert: Thank you first as usual for a beautiful talk. I guess I have two small comments: 
one about the issue of division of labour that you were talking about. Here is the 
thing: presumably what we are interested in showing, among the asymmetries between past 
and future, is (for example) not merely that epistemic access to 
the past is different from epistemic access to the future for human beings or for mammals, 
or for terrestrial organisms, or biological organisms in general. It is something much more 
general and much more fundamental than that. So that in that sense there is a sort of nat- 
ural expectation that it should have some fairly direct link to the fundamental physics. If 
we were to encounter Martians that had time-reversed epistemic access relative to us we 
would be flabbergasted. So actually there is a sense in which it does not feel like a job for 
neurologists or psychologists or experts in human cognition. There is a sense in which it 
feels like a job for the fundamental physics. That is one thing. 

A related point: The people who talk about features of the initial macro-state of the 
universe are trying to do a job slightly different from the one you describe here. That is, 
they are not just trying to justify the macro-dynamics or to explain the macro-dynamics. 
What such people are usually trying to do is something in a way more ambitious. I mean, 
maybe that is a foolhardy task, but they are trying to both justify the macro-dynamics 
and sort of systematize the whole process of inference towards past and future. So, for 
example, on the way of reasoning that you are describing, there is a bunch of earlier states 
that historical inference could lead to. One is the state we think pertained five minutes ago, 
the macro-state of the world. Another is the macro-state of the world we think pertained 
ten minutes ago. And so on and so forth, all the way back to the Big Bang. So I think I 
completely agree with you that if the task at hand is just to explain the macro-dynamics 
you need much less than that. If the task you are looking to do is more ambitious than that, 
you may need more. 

David Wallace: It is absolutely right of course that we want a very general explanation 
of the epistemic asymmetries, and I agree with you it is a job for physics, but it is not 
necessarily a job for fundamental physics. We live in a world where the degrees of freedom 
and the macroscopic functionals are governed by time-asymmetric dynamical laws. In a 
situation like that it is unsurprising to find these kinds of general epistemic asymmetries; 
there is further work to try and understand them and get them out but it is not a mystery in 
that sense. 

Having said that, in some sense this is spoils to the victor: if we can give that explanation 
directly in terms of fundamental physics, why not? 

As to the other point: I do not disagree there at all. But it is sometimes said that the past 
hypothesis is specifically something we need to explain the fact that statistical mechanics 
techniques work. My claim is: that is not really what is doing the work. Put in your frame- 
work, what is actually doing all the work is the probability distribution over the initial 
macro-condition, not the choice of macro-condition. 

Carlo Rovelli: If I had to cover both talks, David [Wallace] and David [Albert], my 
comment would be that according to my own understanding everything that has been said 
is exactly right, as far as I understand. (Which, of course, might be due to the fact that I am 
a theoretical physicist, I do not get saddled with philosophy.) But it seems to me: did we 
not know that; is that not the way we understand things; have you not just put clearly what 
was understood? And in fact was it not all understood by Boltzmann, especially in what he 
wrote after he got all the criticisms of his H theorem? 

I want to make a point related to that. First, let me say that I talked about the problem of 
the special initial condition in my first talk this morning; perhaps my talk should have been 
after yours because the point of departure of my talk was the conclusion that you arrived 
at. But what I want to say is that after getting the criticism for his H theorem Boltzmann got 
to this point, which I think is not often appreciated: he was thinking in probabilistic terms, 
he was thinking in terms of equilibrium fluctuations, and what he proved — or rather, what 
he stated was the case, and in fact what has been proved recently by people in statistical 
theory, is the following. That if you take a statistical system and you look at the fluctuations 
of its entropy (which of course is not going to be maximal, it is going to fluctuate) you can 
ask why, if I am at a certain point away from the maximum, I find that it goes up in one 
direction in time and down in the other, while my theory tells me that it should behave 
the same way in both directions. And here is an answer to that: it is a beautiful answer. Given a 
value, the most probable situation is that it is the peak of a fluctuation. So this makes it clear 
that his own result is valid not because it breaks time-reversal invariance; it confirms time-reversal 
invariance and it shows that given a situation in which you are out of equilibrium 
the most likely evolution is towards equilibrium in both directions. And of course (in the context 
of our observations) the only way to make sense of that is to go back to the origin of the 
universe, so making it a cosmological posit. The theorem has been proved quite recently in 
statistical mechanics rigorously; Boltzmann just guessed it. 

David Wallace: I want to pick up on the first point. The job of the philosopher of physics 
is to tell people the physics they know and then say that it is profound. I am actually almost 
serious about that, not just being self-deprecating: a lot of the point of this kind of work 
is to take things that are tacitly grasped by people, and understand clearly and consistently 
and explicitly how those things should be understood. It almost shades into pedagogy at 
some level. 

But I do think there is a degree of mismatch between what is said in general and verbal 
terms about why systems equilibrate, and what you actually find if you go down into the 
weeds and look at the way we construct and derive our detailed quantitative understanding 
of equilibration. I think that is of some interest, and in particular I think it is somewhat 
striking that the role of low entropy assumptions is more indirect and much less clear than 
it can seem in the general discussions. 

I have been interested in this stuff partly because I thought: okay, I buy the general claim 
that the way to understand why statistical mechanics works is because the initial universe 
had a low entropy, so let us actually have a look at how those equations work and see where 
the low entropy assumption in particular is coming in. And then it turned out to be more 
complicated than that. 

Simon Saunders: I think the value of your talk, David, for me anyway, is that it seems to 
undercut a sense of mystery or some sense of strangeness or weirdness about how special 
the initial condition has to be. It seems — and Penrose has made a lot of this [6] — that by 
virtue of the fact that the final state of the universe, given black holes and eventually black 
hole evaporation, has a fabulously higher entropy than it has now, one gets to the picture 
that the early state of the universe had to be amazingly, precisely fine-tuned so as to have 
low entropy. And your punchline, in a way, is: no, it is not that special. 

Don Page: That is very controversial, I did not think you [Wallace] said that. Did you 
really say that? 

Simon Saunders: Let me carry on, because I am presenting a gloss on what David is 
saying and maybe David will tell me I have got the gloss wrong. My gloss is that the 
initial state of the universe, far from being special, precisely is not at the microscopic level 
carefully calibrated so as to have the sorts of coincidental relations that could bring together 
what would look to us like time-reversed macroscopic phenomenology. 

And I just want to push that, in that if you imagine Penrose, or some other physicist or 
mathematician, doing some other calculation which shows how yet more extraordinarily 
special the initial state seems to have to be, because of anthropic considerations, perhaps 
the right way to think of that is that what they have really shown is how extraordinarily 
high the final entropy of the universe can be, how extraordinarily large entropy can grow 
through physical processes. And perhaps black hole thermodynamics is an example of that, 
that prior to the understanding of black holes we did not realize how large the entropy could 
get to be. 

So I wonder if you would embrace that way of thinking about it. I think the comeback 
on that would be to say, well, okay, the initial state of the universe has to be amazingly 
non-special at the microscopic level, it must not have all of these careful, finely tweaked 
relations among the, whatever it is, particles and so forth, and you might say that what we 
are really learning is: you think you have got a non-special state because it does not have 
those finely-tuned correlations, well, guess what, it has got to be even more non-special 
than we thought it had to be. 

Just to summarize then, suppose we learn of some new mechanism, as we did in black 
hole thermodynamics, such that the final state of the universe could be even more extraordinarily 
high entropy: is that the right way to be surprised? Or is it Penrose’s way: no, 
the surprise is to find how extraordinarily fine-tuned the state of the universe has to be so 
as to be low entropy? Or, the third way that I am offering as well: is it that we learn how 
extraordinarily non-special the initial state has to be? 

David Wallace: I think this partly says what we mean by ‘special’. So (and I take this to 
be what is worrying Don), I am obviously not claiming that the initial macro-state does not 
occupy an extremely small region of phase space. Clearly it does. What I am claiming is 
that this fact about the macro-state is not the thing which is doing the explanatory work of 
saying why the law-like regularities of high-level statistical mechanics hold. 

Now of course if we imagine a fictional world that (unlike in real cosmology) actually 
had an equilibrium state (a box-like world of some kind), then if its initial state 
was unspecial in the macro sense it would be at equilibrium. The claim that the macro-dynamical 
laws held would then be boring because they just say: you are at the attractor, 
stay there. But nonetheless the thing that is doing the work in explaining why those (boring) 
law-like regularities hold is the non-specialness of the micro-structure within that macro-state. 
And you could perfectly well imagine a universe that was actually at equilibrium 
but whose micro-state was extremely special in just the right way that it went away from 
equilibrium. (Take a system that has just equilibrated, and time-reverse its micro-state.) 

On the broader question of the specialness of the initial macro state: I have to confess 
I am not entirely clear, and I am less clear the more I think about it, quite what it is that 
concerns us about it. Granted, the theory that God created the universe by picking a point 
uniformly at random with respect to the Liouville measure on phase space is falsified by 
the data. But that was not a very plausible creation myth in the first place. 

I think what is going on is something like this. In general, a really good way of studying 
a large system in a given macro-state is to assume its micro-state is chosen at random with 
respect to the Liouville measure over that macro-state. And we have reasonable dynamical 
grounds to explain why that is a good thing to do. But we are in danger of extrapolating 
this back to a more transcendental principle that it is a priori the case that a system is 
equally likely to be anywhere in phase space. And that is what gets you into skeptical 
catastrophes, where we say that somehow our memory of the past is completely unreliable 
because it is much, much more likely we have fluctuated in from a higher entropy state. 
But I do not think that there are any good reasons on the table to think that the right a priori 
probability distribution is uniform across phase space. There plausibly are good general a 
priori assumptions, coming from general epistemology or general philosophy of science, 
for thinking that the initial macro-state should be a relatively simple, or relatively easily 
describable state, but in gravitational systems that simplistic criterion tends to pick out low 
entropy states rather than high entropy states. For discussion of the conceptual features 
of entropy in self-gravitating systems see Wallace [9] and references therein. 

So in conclusion, I am not entirely sure what the fuss is about, so I am not quite sure 
therefore what I should be saying in response to your trilemma at the end. 

Cormac O’Raifeartaigh: Many thanks to both Davids for talks which were extremely 
clear; that is not easy when you are going from discipline to discipline. If I could ask 
in that spirit a very simple question which for the philosophers is probably kindergarten. 
What do you make of Lee Smolin’s view that there is not necessarily a tension between 
the fact that some of our equations in particle physics do not include time, and the way 
that we see time in the observable universe? His simple answer to that [tension] is that 
since the Dirac equation, physicists have been haunted by the notion that every solution 
has to represent the real world, whereas in fact mathematics is simply a representation of 
nature; that is, we fall into the old problem of confusing reality itself with our representation 
of reality. What do you make of people who duck the whole question by saying “there is 
not necessarily a conflict there, this is simply a facet of the way we represent nature”? 

David Wallace: The move of saying “our physics is not fully representing the world” is 
always available, but I think the only really good way to tell if our physics cannot represent 
the world is to try really, really hard to represent the world and see if we fail. I am a bit 
nervous about assuming that it is the case. That is a general philosophical nervousness 
about Smolin’s move. 

Let me use that as a sort of advert for some of the virtues of the approach in my talk. 
The question of how we reconcile time-symmetric micro-dynamics with the panoply of 
phenomena in the observed universe is so general, has so many different facets, that there 
is all sorts of space for more philosophical attempts to make the problem go away. The 
question of how it is that we can derive the Fokker–Planck equation from Newtonian 
dynamics, given that Newtonian dynamics is time-reversal symmetric and the Fokker–Planck 
equation is not, and that the Fokker–Planck equation is probabilistic and Newtonian 
mechanics is not; that question is much tighter and sharper. The problem is not some 
kind of philosophical paradox; it is a straight mathematical contradiction, so some of the 
assumptions in it must be wrong, and you can push at where those are. 


25 
Big and Small 


DAVID Z. ALBERT 


Our everyday macroscopic experience of being in the world is saturated with asymme- 
tries — thermodynamic asymmetries, and radiative asymmetries, and epistemic asymme- 
tries, and phenomenological asymmetries, and asymmetries of over-determination, and 
asymmetries of influence, and what have you — between the past and the future. 

And there is a long-cherished hope — something that has its origins in the work of Boltz- 
mann, and which has been pursued, by any number of other investigators, through any 
number of fits and starts and revelations and wrong turns, ever since — that all of those 
asymmetries can ultimately be traced back to some relatively simple characteristic of the 
initial macrocondition of the universe. The thought (as people put it now) is that all we need 
to do, in order to account for these asymmetries, is to add to the fundamental time-reversal- 
symmetric dynamical laws, and to the standard statistical-mechanical probability-measure 
over the space of possible fundamental physical states, a simple postulate — a so-called 
past-hypothesis — to the effect that the world first came into being in whatever particular 
low-entropy macrocondition it is that the normal inferential procedures of cosmology are 
eventually going to present to us. 

The business of working this thought out in detail is a large undertaking, which is still 
very much in its infancy, and which is still very much under debate — and I do not want 
to attempt anything along the lines of an overview of that project here. All I want to talk 
about in this chapter is a widespread and fundamental and perennial sort of puzzlement 
about how such a project could even seriously be entertained — a puzzlement (that is) 
about how it is that the macrocondition of the universe 14 billion years ago — all by itself — 
could even imaginably be up to the job of explaining so much about the feel, now and on 
earth, of the passing of time. 

This puzzlement takes a number of different forms, and arises in a number of different 
contexts. 

On the most trivial level, there is a question of how the lowness of the entropy of the 
world 14 billion years ago can impose any genuinely profound and vivid constraints what- 
ever on what the world is doing now. And all that needs to be said, in order to make that 
sort of puzzlement go away, is that although 14 billion years is a long time, the entropy 
of the universe at that time was very, very low, and that (in particular) 14 billion years 
is a great deal shorter than the expected relaxation time of the state in which our universe 
seems to have started out. 

There is a somewhat more interesting question about how the lowness of the entropy of 
the world 14 billion years ago can have any genuinely profound and vivid effects, or impose 
any genuinely profound and vivid constraints, on what particular, localized, human-scale, 
quasi-isolated sub-systems of the world are doing now. There is a worry (in particular) 
that runs like this: Suppose we grant that the standard Boltzmannian arguments do indeed 
establish that the overall entropy of a universe which starts out in a low-entropy past- 
hypothesis sort of macrostate is overwhelmingly likely to rise towards the future. That 
does nothing at all (so the worry goes) to show that the entropies of quasi-isolated sub- 
systems of the world are likely to rise in the same direction — and it is the behaviors of 
those latter systems (after all), and not of the universe as a whole, that is the central and 
paradigmatic topic of thermodynamics! 

Jenann Ismael (for example) has written, in precisely this connection, that “It should 
be clear that you can’t in general move from a property of the whole to properties of its 
parts. The fact that dogs can run doesn’t mean that dog heads can run. Forests expand and 
babies grow, but not because the trees or cells that make them up do. The global story isn’t 
just the local story, writ large. There are all kinds of ways in which a system could have 
global dynamical properties that are not properties of any of its parts.” (Ismael, 2016). And 
there is a paper of Eric Winsberg’s (2016) in which he worries that the conditions that are 
required in order to run the standard Boltzmannian argument to the effect that the entropy 
of an isolated system will increase towards the future may somehow be altogether different 
from the conditions that are required in order to run the analogous argument on whatever 
quasi-isolated sub-systems of that larger system happen to emerge as the larger system 
evolves in time. 

It is not clear exactly what this worry is about, or where it comes from (although I 
may be missing something). The Boltzmannian tradition has given us an argument that the 
overwhelming majority of microconditions compatible with any non-equilibrium macro- 
condition of any isolated thermodynamic system are sitting on trajectories on which all of 
its thermodynamic properties — and not merely its overall entropy — are going to evolve 
towards the future, in the general direction of equilibrium, in accord with our usual ther- 
modynamic expectations. If (for example) the isolated system in question consists of two 
gases, which are initially at different temperatures, and which are in thermal contact with 
one another, what the Boltzmannian arguments make plausible is not merely that the over- 
whelming majority of microconditions compatible with this initial macrocondition are 
sitting on trajectories on which the overall entropy of this system is going to increase 
towards the future, but (in addition) that the overwhelming majority of those microcondi- 
tions are sitting on trajectories on which the entropy of the hotter gas is going to decrease 
towards the future. And if the isolated system in question consists of (say) 12 isolated 
gases, in 12 separated containers, each of which is initially far from its own individual 
equilibrium state, then what the Boltzmannian arguments are going to make plausible is 
not merely that the overwhelming majority of microconditions compatible with this initial 
macrocondition are sitting on trajectories on which the overall entropy of this system will 
rise towards the future, but (in addition) that the overwhelming majority of those micro- 
conditions are sitting on trajectories on which the separate entropies of each of those 12 
gases are all, individually, going to rise towards the future. Of course, the number of micro- 
conditions on which the entropy of some particular one of those gases goes down towards 
the future will be much larger than the number on which the overall entropy of the 12 of 
them together goes down towards the future — but both of those numbers are going to be 
fantastically small, and in both cases the microconditions in question are going to be scat- 
tered, more or less at random, in unimaginably tiny clumps, all over the phase space, and 
the likelihood of the microstate of the system ever wandering into either one of these two 
different sorts of clumps seems, on the face of it, extremely remote. 
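
The 12-container case can be illustrated numerically. What follows is only a rough sketch, and it rests on assumptions that are not in the text: each gas is replaced by an Ehrenfest urn model (particles hopping at random between the two halves of a box), the entropy of a container is taken to be the logarithm of the number of arrangements with a given number of particles in the left half, and the names and sizes (N_GASES, N_PARTICLES, N_STEPS, relax, run) are chosen purely for illustration. The stochastic hopping is a stand-in for the deterministic microdynamics, not a model of it.

```python
import math
import random

N_GASES = 12        # the twelve separated containers of the example
N_PARTICLES = 500   # particles per container (illustrative size, an assumption)
N_STEPS = 5000      # single-particle hops per container (an assumption)


def boltzmann_entropy(n_left, n_total):
    """Log of the number of arrangements with n_left particles in the left half."""
    return math.log(math.comb(n_total, n_left))


def relax(n_left, n_total, n_steps, rng):
    """Ehrenfest dynamics: each step, one uniformly chosen particle changes halves."""
    for _ in range(n_steps):
        if rng.random() < n_left / n_total:
            n_left -= 1  # the chosen particle was in the left half
        else:
            n_left += 1  # the chosen particle was in the right half
    return n_left


def run(seed=0):
    """Start every container far from equilibrium (all particles on the left)."""
    rng = random.Random(seed)
    initial = [boltzmann_entropy(N_PARTICLES, N_PARTICLES) for _ in range(N_GASES)]
    final = []
    for _ in range(N_GASES):
        n_left = relax(N_PARTICLES, N_PARTICLES, N_STEPS, rng)
        final.append(boltzmann_entropy(n_left, N_PARTICLES))
    return initial, final


if __name__ == "__main__":
    initial, final = run()
    rose = sum(f > i for i, f in zip(initial, final))
    print(f"containers whose entropy rose: {rose} of {N_GASES}")
    print(f"total entropy: {sum(initial):.1f} -> {sum(final):.1f}")
```

In a model of this kind the rise in the total is not something over and above the rises in the parts: each container relaxes on its own, and the total goes up because every one of its summands does.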

But let me go on to the question I really want to address — which is the third and the 
deepest and the most interesting and the most amorphous and the most phenomenological 
form of the general puzzlement about the Boltzmannian project that I mentioned at the 
outset of the chapter. 

Let me put it in four increasingly concrete and increasingly simple and increasingly 
tractable ways: 


1. How can it seriously be imagined that my own sense of the passage of time, how can it 
seriously be imagined (for example) that my own sense — right here and right now — of 
whether some particular baseball happens to be flying towards me or away from me, is 
somehow anchored in the lowness of the entropy of the universe 14 billion years ago? 

2. How can it be, how can it work, that the increase of the entropy of the world, or of 
myself, somehow constitutes the standard or the yardstick against which I judge the 
direction in which events are unfolding? How is it (that is) that the entropy gradient 
of anything ever comes into the picture? I am certainly not aware of checking on the 
entropy gradient of anything in the course of deciding whether the baseball is flying 
towards me or away from me. No comparison with anything else, so far as I am aware, 
is involved. I simply, directly, unmediatedly see that the baseball is flying either towards 
me or away from me. 

3. Consider the sense of the direction of time that is implicit in the operations of (for exam- 
ple) a simple mechanical realization of a Turing machine. Can anyone seriously believe 
that thermodynamical characteristics of the world somehow play a role in the way a 
machine like that distinguishes between what it has just done and what it is to do next? 
How so? How can that be? How would that work? Machines like that can apparently 
function perfectly well, machines like that apparently have no trouble at all distinguish- 
ing between what they have just done and what they are to do next, without the aid 
of special devices for measuring the entropy-gradient of the world, or themselves, or 
anything else! 

4. Consider (finally) a simple mechanical device which has no other business than distin- 
guishing between what it has just done and what it is to do next — the paradigmatic 
distinguisher, the distinguisher par excellence, between what it has just done and what 
it is to do next. Think (that is) of a clock. And think (for the sake of concreteness, for 
the sake of simplicity) of an old-fashioned, fully mechanical, pendulum-clock. 


Good. Now we have our hands on something that we are in a position to analyze in 
detail. 

Note that in the course of the normal and intended operations of a clock like that, there 
are going to be moments — the moments (in particular) when the pendulum is precisely 
at the apogee of its swing — when every last one of its macroscopic moving parts is fully 
at rest. Note (to put it slightly differently) that in the course of the normal and intended 
operations of a clock like that, there are going to be moments, the moments (again) when 
the pendulum is precisely at the apogee of its swing, when the macrocondition of the clock, 
in its entirety, is invariant under time-reversal. And consider how it is, at such moments, 
that the clock manages to distinguish between what it has just done and what it is to do 
next. 

The macrocondition of the clock, together with the microscopic dynamical equations 
of motion, together with the statistical postulate, is manifestly not going to do the trick. 
For if the present macrocondition of the clock together with the microscopic dynam- 
ical equations of motion and the statistical postulate makes it likely that the clock is 
going to read (say) 3:05 five minutes from now, and if the present macrocondition of 
the clock is invariant under time-reversal, then the present macrocondition of the clock, 
together with the microscopic dynamical equations of motion and the statistical postu- 
late — both of which are invariant under time-reversal as well — is necessarily also going 
to make it likely, and to exactly the same degree, that the clock read 3:05 five minutes 
ago. 
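
The symmetry argument of the preceding paragraph can be written out in a few lines. The notation is an assumption introduced here for illustration and is not the author's: $\Gamma$ is the phase space, $\phi_t$ the dynamical flow, $R$ the reversal of all velocities, $\mu$ the standard statistical-mechanical measure, $M$ the present macrocondition of the clock, and $A$ the set of microstates in which the hands read 3:05. The fragment assumes the amsmath package.

```latex
% Assumed premises (illustrative notation, not the author's):
%   R \phi_t R = \phi_{-t}   (the dynamics are time-reversal invariant)
%   \mu(RB) = \mu(B)         (the measure is time-reversal invariant)
%   RM = M                   (the present macrocondition is time-reversal invariant)
%   RA = A                   (a reading of 3:05 does not depend on velocities)
\begin{align*}
\{x \in M : \phi_{-t}(x) \in A\}
  &= \{x \in M : \phi_t(Rx) \in RA\}
   = R\,\{y \in M : \phi_t(y) \in A\}, \\[4pt]
\Pr(\text{3:05 at } {-t} \mid M)
  &= \frac{\mu\bigl(\{x \in M : \phi_{-t}(x) \in A\}\bigr)}{\mu(M)}
   = \frac{\mu\bigl(\{y \in M : \phi_t(y) \in A\}\bigr)}{\mu(M)}
   = \Pr(\text{3:05 at } {+t} \mid M).
\end{align*}
```

The first line uses the substitution $y = Rx$ together with $RM = M$ and $RA = A$; the second uses the invariance of $\mu$ under $R$. Nothing in the premises singles out one temporal direction, and so nothing in the conclusion does either.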

And all there is to break the symmetry, all there is that stands in the way of the clock’s 
having read 3:05 five minutes ago, is the past-hypothesis. The clock’s ability to distinguish 
between what it did last and what it does next, and your ability to distinguish between 
a baseball’s flying towards you and a baseball’s flying away from you, are anchored in 
the entropy-gradient of the universe. If we were to hold the present macrocondition of the 
world fixed, and move the past-hypothesis from the beginning of time to its end, the clock 
would run backwards. 

Period. 

People sometimes find it hard to take this in. Consider (for example) the following reac- 
tion, from a well-known theoretical physicist, which is worth quoting exactly, and in its 
entirety: 


It’s uncharacteristic of Albert to pass over details. He could’ve described how a pendulum clock 
works (e.g. falling weight version) in a couple of minutes, but he didn’t. The mechanism CANNOT 
run backwards. What drives it is the falling of the weight, pulling on the cord. If there’s no pull on the 
cord the clock stops. If there’s PUSH on the cord the clock stops. It’ll ONLY work if there’s tension 
on the cord and that will make the hands move clockwise because of the way the cord is wound 
around the drive axle. And that would be true even if the weight rose into the air and started pulling 
upwards. Putting the low entropy in the future of the universe can’t make the clock run backwards, 
as he claims. The clock "knows" which way to go, if it’s going to go at all, because the information 
is built into it in the way that the cord is wound around the axle. 


It is hard to know exactly what to say about a worry like that — except to repeat, maybe a 
little louder, that it is simply and radically mistaken. Absolutely no such additional details 
about how the clock works can make the slightest bit of difference. Consider (again) a 
moment when the macroscopic state of the clock is stationary. Consider (that is) a moment 
when the macroscopic state of the clock is invariant under a reversal of all of the velocities 
of its microscopic particulate constituents. In the absence of a past-hypothesis, the predic- 
tions of statistical mechanics about the evolution of the macroscopic condition of this clock 
away from that moment towards the future are going to be identical to its predictions about 
the evolution of its macroscopic condition away from that moment towards the past, for the 
simple reason that there is nothing whatever in the situation to distinguish between them. 
In the absence of a past-hypothesis, the predictions of statistical mechanics are that as we 
proceed away from the present, in either temporal direction, the cord is always going to be 
unwinding, and the weight is always going to be going down, and the hands of the clock are 
always going to be turning in the clockwise direction. And it is only because of the truth of 
the past-hypothesis that (as a matter of actual fact) those hands turn counter-clockwise, as 
we proceed away from the present, in the direction of the past. And if we were to switch 
the low-entropy condition from the beginning of the universe to its end, then everything 
would be the other way around. 
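
A toy model may help to make the two-directional prediction vivid. The sketch below uses the Kac ring, a standard reversible toy system that is not discussed in the text and is brought in here only as an assumption-laden illustration: balls coloured +1 or -1 sit on the sites of a ring, every ball moves one site per step, and a ball changes colour whenever it crosses a marked edge. The colour imbalance plays the role of the macrocondition, and all function names and parameter values are hypothetical.

```python
import random


def make_markers(n_sites, density, seed):
    """markers[i] is True if the edge between site i and site i+1 is marked."""
    rng = random.Random(seed)
    return [rng.random() < density for _ in range(n_sites)]


def step(colors, markers, direction):
    """direction=+1 moves every ball one site clockwise; -1 is the exact inverse map."""
    n = len(colors)
    new = [0] * n
    for i in range(n):
        if direction == +1:
            edge, target = i, (i + 1) % n            # cross edge i to reach site i+1
        else:
            edge, target = (i - 1) % n, (i - 1) % n  # cross edge i-1 to reach site i-1
        new[target] = -colors[i] if markers[edge] else colors[i]
    return new


def imbalance(colors):
    """Macroscopic variable: 1.0 for a uniform ring, near 0 at 'equilibrium'."""
    return sum(colors) / len(colors)


def history(colors, markers, direction, n_steps):
    """Imbalance at each of n_steps steps away from the given state."""
    out = [imbalance(colors)]
    for _ in range(n_steps):
        colors = step(colors, markers, direction)
        out.append(imbalance(colors))
    return out


if __name__ == "__main__":
    n_sites, n_steps = 10_000, 50
    markers = make_markers(n_sites, density=0.1, seed=1)
    present = [1] * n_sites  # a low-entropy macrocondition, held fixed
    toward_future = history(present, markers, +1, n_steps)
    toward_past = history(present, markers, -1, n_steps)
    for t in (0, 5, 10, 50):
        print(t, round(toward_future[t], 3), round(toward_past[t], 3))
```

With only the present macrocondition and the reversible dynamics to go on, the imbalance decays in the same way whichever direction one steps away from the present; in this model the two histories coincide exactly. Something further, playing the part of a past-hypothesis, would be needed to make the two directions differ.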

It turns out (then) to be essential to the intended functioning of a pendulum clock, or of 
a Turing machine, or of a human brain — it turns out (that is) to be precisely the opposite 
of an irrelevancy or an inconvenience or a potential source of error — that it be in thermal 
disequilibrium. A pendulum clock — no less than a puff of smoke or a block of ice — is 
(among other things) an instrument for measuring the entropy-gradient of the world. A 
pendulum clock is (more particularly) an instrument whose hands cannot help but turn 
clockwise — for exactly the sorts of reasons that are spelled out in the quotation from the 
physicist above — in the temporal direction that points away from the past-hypothesis. That 
is how such clocks differ from (say) projectiles, or gyroscopes: a projectile can just as easily move 
this way as that, and a gyroscope can just as easily turn clockwise as counter-clockwise, 
as the entropy of the world increases — and so they are no good at all (unlike a pendulum 
clock, or a block of ice, or me, or you) at telling past from future. 